This study critically examines the ray tracing process
used  in  the  generation  of  high-complexity  images  in  computer 
graphics  and  provides  design  parameters  for  hardware  which 
will  alleviate  bottlenecks  inherent  in  the  ray  tracing 
procedure.  A ray  tracing  algorithm  is  developed  and 
bottleneck  points  in  the  ray  tracing  algorithm  are  identified 
which  can  be  eliminated  by  hardware  implementation. 

To  develop  the  new  algorithm,  the  traditional  fast 
algorithms  are  studied.  By  combining  the  strengths  and 
considering  the  weak  points  of  various  algorithms,  a new 
procedure  is  proposed  that  eliminates  the  inherent  limitations 
of  the  basic  ray  tracing  process. 

The  new  algorithm  employs  sphere  bounding  volumes  to 
reduce  the  number  of  ray-object  intersections  in  the  basic  ray 
tracing  algorithm.  Traditional  bounding  volumes  are  used  to 
bound objects. Sphere bounding volumes in the new algorithm
are  used  to  bound  subspaces  which  could  contain  whole  objects 
or  parts  of  objects. 

The sphere bounding volumes are sorted with respect to
ray direction for each ray. Traditional sorting of sphere
bounding volumes requires calculating a square root. To avoid
square root calculations, we develop a comparison algorithm
which uses the coefficients of quadratic equations for sorting
bounding volumes. In traditional ray tracing algorithms the
greatest  computational  load  arises  from  the  calculation  of 
ray-object  intersections.  In  the  new  algorithm,  ray-object 
intersection  tests  start  from  the  nearest  bounding  volume.  If 
the  ray  hits  an  object  in  the  bounding  volume,  the 
intersection  test  is  terminated.  If  not,  an  intersection  test 
is  performed  on  the  next  nearest  bounding  volume. 

Since  the  new  bounding  volume  is  established  in  the  image 
space,  not  in  object  space,  we  must  check  whether  the 
intersection  point  is  in  the  bounding  volume  when  a ray  hits 
an  object.  For  this  test  we  develop  a simple  procedure  using 
the  coefficients  of  bounding  volumes.  The  performance  of  the 
new algorithm is verified with computer simulations. We
compare the outputs produced by the two algorithms (the
traditional ray tracing algorithm and the new algorithm) and show
a substantial reduction in overall ray tracing calculation
time.  Characteristics  of  hardware  modules  are  developed  which 
can  further  reduce  the  image  rendering  time. 

CHAPTER 1
INTRODUCTION 

1.1  Motivation 

The quest for visual realism continues to be a major
research area in computer graphics. The thrust is to achieve
images indistinguishable from a view of a real scene, i.e. we
desire to create a visual environment of artificial reality.
Efforts continue to devise techniques that can ever more
faithfully account for visual effects in computer-produced
images. Concurrent with the search for more realistic image
detail is a search for more efficient ways of carrying out
well-understood basic image calculations.

Computer  graphics  developers  are  continually  looking  for 
computationally  economic  techniques  to  simulate  virtual 
reality.  As  computers  have  become  more  powerful  and  graphical 
hardware  I/O  devices  more  prevalent,  photo  realism  has  been 
achieved.  Photo  realism  is  achieved  by  painting  on  the  display 
surface  an  image  that  focusses  onto  the  retina  of  an  observer 
a picture  that  would  normally  be  produced  by  a natural 
environment.  The  basic  underlying  technique  is  to  simulate, 
as  far  as  possible  within  the  constraints  of  the  resolution 
imposed  by  the  display  hardware,  a view  of  the  natural  scene. 
The basic technique is to put on the screen the image values
that the natural scene would produce through the rays
that are ultimately focussed on the retina. The real-world
environment produces a plethora of rays, scattering light in
every direction, and only a tiny fraction ever finds its way
to an observer's eye, thus producing a directional view of the
real scene. To calculate all the rays in the real world is
wasteful for producing a computer-displayed image: that image
is only one of the infinitely many views formed by the rays
emanating from the real scene, and all the other views are not
observable from the observer's viewpoint. The technique of
calculating all rays is termed forward ray tracing. The visible
scene, made up of the rays entering the observer's retina,
requires far fewer rays to be considered. These rays, and their
production from elements of the real-world scene, can be
mimicked by following them from their destination (a spot on
the retina) back to the sources of the light whence they came.
This technique is called reverse ray tracing, or often simply
"ray tracing".

Ray  tracing  is  an  image  rendering  method  which  processes 
each  pixel  in  turn  and  finds  the  surface  point  in  the  three 
dimensional  scene,  of  which  a view  is  being  presented,  which 
determines  its  intensity  and  color.  The  image  is  not  painted 
on  the  retina,  but  rather  a display  screen,  which  in  turn  is 
focussed  by  the  observer.  On  the  screen  we  paint  a set  of 
pixels,  which  depict  the  desired  view  of  the  real  world.  The 
ray tracing method is based on following rays from the
viewpoint  through  each  pixel  until  the  rays  meet  a surface  of 
an  object.  It  is  the  coloring  of  that  surface  point  that  is 
painted  as  the  color  of  that  pixel. 

The  ray  tracing  algorithm  itself  allows  the  incorporation 
of  many  visual  effects  in  a straightforward  manner.  Adding  the 
same  effects  into  other  three-dimensional  computer  graphics 
techniques  is  much  more  difficult,  if  not  impossible  [Lin  92]. 
The technique of ray tracing resulted from the endless pursuit
of photo realism.

Ray  tracing  produces  high  quality  images,  at  a high 
computational  cost.  One  of  the  biggest  costs  is  the 
calculation  of  the  visible  object  element  at  each  pixel 
location.  The  algorithm  must  find  the  nearest  object  point 
from  the  location  of  the  view  point.  Therefore  the  heart  of 
any  ray  tracing  package  is  the  set  of  ray  intersection 
routines.  No  matter  what  kind  of  techniques  are  applied  to  ray 
tracing,  there  is  always  the  need  to  find  the  intersection 
point  of  a ray  and  an  object.  The  basic  ray  tracing  algorithm 
is 

for (each pixel on the display)
    for (each object)
        find the nearest surface point
        retain the nearest surface point
    calculate the color of that point

For example, if the screen resolution is 1000 x 1000 and
if there are 100 objects in the scene, the basic algorithm
will require at least 100 million ray intersection
calculations, of which only the nearest hit for each pixel
(at most one million results) is used to calculate the
picture coloring. If the objects themselves are defined by
complex methods, each intersection calculation will also take
a considerable amount of time. Even the largest supercomputer
would find this computing requirement hard to satisfy within
a reasonable running time.
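
The operation count can be checked with a minimal sketch of the brute-force loop. The sketch rests on assumptions that are not part of the original text: spheres are the only primitive, the camera is orthographic and looks down the +z axis, and the names `intersect_sphere` and `render` are purely illustrative.

```python
import math


def intersect_sphere(origin, direction, center, radius):
    """Return the smallest positive ray parameter t, or None on a miss.

    The direction is assumed to be a unit vector, so the ray-sphere
    equation reduces to t^2 + 2*b*t + c = 0.
    """
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-9:
            return t
    return None


def render(width, height, spheres):
    """Brute-force loop: every pixel is tested against every object.

    Returns the index of the nearest sphere per pixel (None for
    background) and the total number of intersection tests performed.
    """
    tests = 0
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            # Hypothetical orthographic camera: one ray per pixel center.
            origin = (x + 0.5, y + 0.5, 0.0)
            direction = (0.0, 0.0, 1.0)
            nearest_t, nearest_id = None, None
            for i, (center, radius) in enumerate(spheres):
                tests += 1
                t = intersect_sphere(origin, direction, center, radius)
                if t is not None and (nearest_t is None or t < nearest_t):
                    nearest_t, nearest_id = t, i
            row.append(nearest_id)
        image.append(row)
    return image, tests
```

The test count returned by `render` is always width x height x number of objects, so a 1000 x 1000 display with 100 objects gives the 100 million tests quoted in the text, regardless of how many of those tests actually hit anything.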

Since the determination of each pixel color does not
depend on the other pixels, parallel processing is possible
pixel by pixel. But because the intersection calculation for
some pixels takes a very long time, real-time interactive
simulation is impossible for realistic image scenes.

To  get  the  high  quality  picture  desired  for  virtual 
reality  applications  using  the  ray  tracing  algorithm,  it  is 
critically  important  to  avoid  wasting  computation  time  on 
checking  the  ray  against  objects  that  have  no  intersection  and 
which  can  be  trivially  eliminated.  The  reduction  of  the  number 
of  intersection  calculations  may  be  done  by  many  software 
optimization  methods.  All  of  those  methods  depend  on 
eliminating  those  unintersected  objects. 

Efficient execution remains one of the greatest challenges
of ray tracing. Despite its impressive image rendering
capability, ray tracing is often dismissed as being too
computationally expensive to be useful. Efficiency is
therefore a critical issue and has been the focus of much
research from the beginning. This has led to many creative
approaches. Decreasing computing time can be achieved both by
software improvements and hardware additions.

1.2  Problem  Definition 

The  reduction  of  the  number  of  intersection  calculations 
may  be  done  by  the  following  four  approaches. 

1.  Bounding  Volume  Method 

2. Hierarchical Bounding Volume Method

3.  Uniform  Spatial  Subdivision  Method 

4.  Non-Uniform  Spatial  Subdivision  Method 

Each approach has its own advantages and disadvantages.
Because the intersection test is simple and no hierarchies are
required, the sphere bounding volume algorithm is the simplest
one. Here objects are bounded by a sphere extending to the
objects' maximal extent in the image space. One needs to check
only if a bounding volume lies in the pixel's position.

This simple test can be used to trivially reduce the
number of candidate objects that need be considered for the
full intersection calculation. Only those objects that meet
the location test are then considered further.

Consider the case shown in Figure 1.1. Here a long thin
object, say a slender rod, is being rendered. Since a
bounding volume should have an easy intersection
test with the ray, a natural choice of bounding volume is a
sphere. When we apply the sphere to this object as a bounding
volume, most of the rays which intersect the bounding volume
will in fact not intersect the object. Thus many of the
calculations will be for naught. In fact, using the bounding
volume approach may easily INCREASE the number of required
operations. Now consider the other case, shown in Figure
1.2. Here the bounding volume encloses several objects. In
this case most of the rays which intersect the bounding volume
will intersect one or more objects. The sphere bounding
volume in Figure 1.2 shows the main idea of the bounding
volume method.

Figure 1.1 Worst Case of Bounding Volume Algorithm
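
The slender-rod case can be made concrete with a short sketch. It shows one common way to test a ray against a bounding sphere without a square root, and a rough estimate of how often a ray that hits the sphere also hits the rod. Both functions are illustrative assumptions rather than the procedure developed later in this work: the estimate takes the rod seen side-on, approximates the bounding radius by half the rod length, and compares projected areas only.

```python
import math


def hits_bounding_sphere(origin, direction, center, radius):
    """Return True if the ray meets the sphere; no square root is needed.

    The direction is assumed to be a unit vector. The test compares the
    squared distance from the sphere center to the ray with radius^2.
    """
    oc = [c - o for o, c in zip(origin, center)]
    dist2 = sum(e * e for e in oc)
    if dist2 <= radius * radius:
        return True  # ray starts inside the sphere
    along = sum(d * e for d, e in zip(direction, oc))
    if along < 0.0:
        return False  # sphere lies behind the ray origin
    return dist2 - along * along <= radius * radius


def rod_hit_fraction(length, rod_radius):
    """Approximate share of sphere-hitting rays that also hit the rod.

    Ratio of projected areas for a rod viewed side-on: a rectangle of
    length x (2 * rod_radius) inside a disc of radius length / 2.
    """
    sphere_area = math.pi * (length / 2.0) ** 2
    rod_area = length * 2.0 * rod_radius
    return rod_area / sphere_area
```

For a rod of length 10 and radius 0.1, `rod_hit_fraction(10.0, 0.1)` is about 0.025, so under these assumptions roughly 97 of every 100 rays that pass the bounding test still miss the object.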

The  critical  question  is  how  to  apply  the  sphere  bounding 
volume  to  every  environment  in  the  image  space,  so  as  to 
achieve  the  best  efficiency  as  shown  in  Figure  1.2.  The  goal 
of  the  present  work  is  to  address  this  problem. 

The  specific  objectives  of  this  study  are  as  follows: 

1.  Analyze  the  ray  tracing  process  to  assess  the  computing 
requirements  in  its  various  phases. 

2.  Develop  a new  speed  up  procedure  for  ray  tracing. 

3.  Implement  this  approach. 

4. Compare the efficiency of the proposed approach
experimentally with that of the original ray tracing
algorithm.


Figure  1.2  Best  Case  of  Bounding  Volume  Algorithm 

1.3 Overview of the Dissertation

The  dissertation  consists  of  six  chapters.  Chapter  1 
explains  the  uses  of  ray  tracing  and  discusses  the 
computational  problems  inherent  in  it.  Chapter  2 critically 
reviews  the  ray  tracing  process.  Chapter  3 discusses  the 
traditional  fast  algorithms  used  in  implementing  the  ray 
tracing procedure. The main problems of each fast algorithm
are also summarized. A new fast algorithm for ray tracing is
presented in Chapter 4. Results of simulations which were used
to  verify  the  performance  of  the  new  fast  algorithm  are 
presented  in  Chapter  5.  The  final  chapter  summarizes  this 
study  and  suggests  directions  for  future  research. 


CHAPTER  2 

THE  RAY  TRACING  PROCEDURE 


2.1 Background

The  objective  of  the  ray  tracing  process  is  to  calculate 
an  image  that  is  a faithful  reproduction  of  a scene,  be  it 
natural  or  an  imagined  one.  The  test  of  a well-rendered  ray 
traced  image  is  in  the  fidelity  with  which  a natural  scene  can 
be  rendered  from  a geometric  specification  of  objects  in  the 
scene,  the  surface  properties  of  those  objects,  and  a 
description  of  the  illumination  of  the  scene.  The  image  is 
displayed  on  a workstation  screen  and  is  viewed  by  an 
observer. The image seen by the observer should be
indistinguishable from a direct view of the natural scene.
It  is  therefore  important  to  review  just  what  the  eye  can  see. 

Normal  color  vision  perceives  images  in  many  colors  and 
with  a high  degree  of  image  fineness  (acuity).  There  are 
basically  two  parts  of  the  field  of  vision:  peripheral  and 
foveal.  It  is  the  foveal  vision  that  has  a high  degree  of 
spatial  precision;  peripheral  vision  is  less  spatially  acute, 
but  has  the  ability  of  detecting  motion  (without  detail)  and 
conveys  to  the  user  a sense  of  presence  in  the  scene.  Foveal 
acuity and color resolution have been studied in detail
[Sou 61]; peripheral vision is less acute and less discriminating
in color. Foveal vision exists in the visual center of a view
field, and subtends only about one degree of arc.

The  view  field  for  a normal  eye  is  almost  180°  left-to- 
right  and  almost  180°  top-down.  Normal  human  vision  is 
binocular:  the  left  and  the  right  eyes  perceive  slightly 
different  images.  The  central  part  of  the  visual  field  is 
common  to  both  eyes,  resulting  in  an  overlapped  binocular 
field  of  some  150°  high  and  about  120°  laterally.  The  visual 
cortex,  the  image  receptor  in  the  human  observer,  derives 
three-dimensional  information  from  the  parallax  between  the 
two  views  [Gra  63]. 

The spacing of receptors in the foveal region is about 2
to 3 μm, and the focal distance of the eye is 15 mm. This
results in a physiological resolution of about 0.5 to 0.7
minutes of arc. The eye in fact can resolve even finer
details on structured targets [Lux 68].
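
The quoted resolution follows from simple geometry: the angle subtended by one receptor spacing at the focal distance. The sketch below reproduces the arithmetic; the function name is illustrative, and treating the eye as a single thin lens with a 15 mm focal distance is a simplification.

```python
import math


def angular_resolution_arcmin(spacing_m, focal_distance_m):
    """Angle subtended by one receptor spacing, in minutes of arc."""
    angle_rad = math.atan(spacing_m / focal_distance_m)
    return math.degrees(angle_rad) * 60.0


# Receptor spacings of 2 and 3 micrometers at a 15 mm focal distance.
fine = angular_resolution_arcmin(2e-6, 15e-3)
coarse = angular_resolution_arcmin(3e-6, 15e-3)
```

The two values come out near 0.46 and 0.69 minutes of arc, in agreement with the range of about 0.5 to 0.7 stated in the text.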

In  forming  an  image  on  its  receptor  surface,  the  retina, 
the  human  eye  is  basically  an  optical  device.  Every  optical 
device  suffers  from  chromatic  aberration.  The  eye  will  focus 
on  green-yellow  light,  making  the  red-light  focus  slightly 
behind  the  retina,  and  the  blue-light  focal  plane  slightly  in 
front  of  the  retina.  Most  measurements  of  visual  acuity, 
image  flicker  rate  and  similar  data,  use  white  light  with 
implicit  color  aberrations. 

The central-field visual acuity varies with the type of
image. Astronomers, before the availability of telescopes,
depended heavily on acute eyesight; a job requirement was
exceptionally keen vision, so that they could tell stars apart
on a dark background. The normal discrimination for this task
is about 2.9 x 10^-4 radians (about 1 minute of arc); for long
lines on a self-luminous background (spider webs lit from
behind), the minimum visual angle is about 8 arc-seconds
[Lux 68].

The  distribution  of  visual  receptors  is  not  uniform  in 
the  human  eye.  The  density  decreases  markedly  toward  the 
periphery.  Rods  (light  intensity  detectors)  number  110  to  130 
million,  and  cones  (color  detectors)  3 to  7 million.  Hence  if 
it  were  possible  to  focus  a computer-produced  image  directly 
on  the  visual  detectors  of  an  observer's  eye,  one  should  be 
able  to  evoke  a full  monocular  image  consistent  with  the  real- 
world  environs  with  about  100  million  picture  elements.  Since 
the  "normal"  workstation  display  has  about  1 million  pixels, 
technology  is  within  two  orders  of  magnitude  to  produce  images 
that  a human  observer  would  not  be  able  to  distinguish,  at 
least  in  image  acuity,  from  a view  of  the  real  world. 

Unfortunately  the  eye  is  in  constant  small  motion  when  a 
view  is  perceived.  This  small  motion  is  a tiny  random,  or 
nearly  random  motion,  the  saccadic  motion  of  the  eye.  We 
should  note  that  during  saccadic  eye  motion  the  observer  (at 
least  the  human  observer)  is  blind,  else  a motion  blur  would 
interfere  with  the  image  being  "seen",  i.e.  processed  for 
understanding. The mechanism of this brief visual blackout is
not well understood at this time, but indicates a highly
complex  interaction  between  the  pure  visual  processing  centers 
in  the  brain  and  the  eye  motion  control  centers.  In  this  eye 
motion,  apparently  the  vision  detectors  are  "scanned"  relative 
to  the  stationary  image;  image  data  processing  in  the  nerve 
net  underlying  the  receptors  allows  the  eye  to  detect  features 
in  an  image  which  are  somewhat  finer  than  the  spacing  of  rods 
in  the  retina. 

Saccadic motion requires that a picture be rendered on
the computer display with as fine a resolution as the best eye
resolving power, i.e. about 1/10 minute of arc in the areas
that the eye might be roving over during normal vision. At
the most this will be a total view angle of 180°. Hence the
number of pixels required will be about 180 x 60 x 10 ≈ 100,000
in each of the horizontal and vertical directions for a
hemispheric panoramic display. This would require 10^10 pixels,
about four orders of magnitude higher than contemporary "high
resolution" workstation displays. Thus even with the best of
current interactive display devices we can produce only a
defocussed approximation of a real-world image.
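
The pixel estimate can be reproduced step by step. The sketch assumes ten samples per minute of arc, which is what the factor of 10 in the product 180 x 60 x 10 implies, and takes one million pixels as the size of a contemporary workstation display; the function names are illustrative.

```python
import math

ARCMIN_PER_DEGREE = 60


def pixels_per_axis(field_of_view_deg, samples_per_arcmin):
    """Number of pixels needed along one axis of the display."""
    return field_of_view_deg * ARCMIN_PER_DEGREE * samples_per_arcmin


def orders_of_magnitude(larger, smaller):
    """Base-10 logarithm of the ratio of two quantities."""
    return math.log10(larger / smaller)


per_axis = pixels_per_axis(180, 10)          # 108,000, i.e. about 100,000
total_pixels = per_axis ** 2                 # about 1.2 x 10^10
gap = orders_of_magnitude(total_pixels, 1_000_000)
```

The unrounded total is about 1.2 x 10^10 pixels, and the gap to a one-megapixel display is just over four orders of magnitude.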

In  general,  color  is  perceived  from  the  stimulation  of 
various  color  receptors  in  the  light  detection  organ  of  the 
perceiver;  these  are  vastly  different  in  various  living 
organisms  [Cro  94].  Hence  the  sensation  of  color  is  a species- 
dependent  phenomenon.  In  some  creatures  there  is  a highly 
developed  color  detection  system  favoring  blue  colors,  as  in 
fish. Thus when we speak of color, we really need to specify
what organism we use as a referent. Clearly, in computer
graphics, we mean "color" to be a human experience. Humans
perceive color from three primary color receptors, which have
relatively broad frequency responses that overlap in the
visible spectrum. Each has its primary response in one band:
the Red (590 nm), the Green (500 nm) or the Blue (470 nm)
[Cro 94]. The combination of the responses evokes the
recognition of color. This is the basis of the three-color
(tri-stimulus) system of the common TV, which defines the
standard colors used in a normal workstation display. Any color
shown is a mixture of the three primary components. In
generating a color image, three primary color components must
be produced for each display pixel.

The  color  resolution  of  the  human  visual  system  is 
usually  measured  in  terms  of  color  saturation  and  color 
purity.  Saturation  measures  the  number  of  steps  that  are 
perceived  between  a "pure"  color  (such  as  a spectral  color) 
and  white  (composed  of  3 equal  parts  of  the  three  Red-Green- 
Blue  primaries).  Such  a measurement  shows  that  the  human  can 
distinguish around a hundred different levels of saturation
[Gra 63].

The color purity discrimination is a bit sharper; a
change of wavelength of around 3 nm can be perceived in the
yellow-green part of the spectrum (about 550 nm). Hence the
color purity is more critical, amounting to about 1/2 % in color
fidelity [Moo 61]. Thus specifying color information to an
8-bit accuracy is consistent with human color perception;
however, for precise applications, when the nonlinear
characteristics of the phosphors in the display surface need
to be considered, a primary color specification of at least
ten bits is necessary [Mar 82].

Thus the color display that is to produce an image of the
real world should show a pair of images, one for each eye;
each display should subtend a visual angle of some 180°; each
should have about 10^10 pixels; and each pixel should be capable
of displaying three primary colors, with a color resolution of
about 10 bits. Clearly we do not now have affordable
technology to approach these numbers; currently we are 4 to 5
orders of magnitude away from them.

One  other  aspect  of  a monocular  view  of  a natural  scene 
is  inherent  depth  information  in  the  wavefront.  An  observer 
can  easily  focus  on  near  objects,  or  far  ones,  thus  deriving 
some  depth  information.  This  is  easily  demonstrated  in  a view 
out  of  a window:  the  observer  can  easily  ignore  (i.e.  NOT 
focus  on)  smudges  on  the  glass  pane,  but  sharply  observe 
distant  scenery.  The  image  which  one  normally  produces  on  a 
workstation  display  surface  lacks  the  focussing  feature. 
Normally  a fixed-focus  distance  (usually  infinite)  is  used  for 
the  production  of  a scene.  Such  an  image  is  then  painted  to 
a screen  at  a fixed  distance  from  the  observer.  In  this 
respect  a computer-produced  image  is  yet  another  approximation 
to a natural scene. This refinement in imaging detail is
usually ignored in artificially produced imaging.

Therefore,  at  best,  current  technology  can  produce  only 
an  approximation,  notably  of  lower  spatial  resolution,  but 
close  to  real-life  colors.  Most  of  the  work  in  this 
dissertation  will  deal  with  workstation  displays,  with  a 
resolution  of  about  1000  x 1000  pixels.  The  visual  image 
normally  subtends  about  30°  to  60°. 

To  maintain  the  illusion  of  stationary,  or  slowly  moving, 
images,  an  image  sequence  is  painted  on  the  display  surface. 
The  human  visual  cortex  will  fuse  sequences  of  images  and 
evoke  the  sensation  of  a stationary  or  smoothly  moving  image, 
without  the  appearance  of  image  flicker,  if  sequential  images 
are  repeated  with  a high  enough  rate.  This  image  fusion 
frequency  depends  on  the  overall  image  brightness.  For  a 
darkened  room,  such  as  a movie  auditorium,  a frequency  of 
24/second  is  adequate.  For  a dimly-lit  daytime  room,  normal 
for  television  viewing,  the  rate  rises  to  around  30/second; 
and  for  normal  daytime  brightness  the  rate  may  exceed 
60/second.  Current  high-performance  workstations  use  display 
refresh  rates  of  about  60  to  70/second. 

We  would,  of  course,  like  to  have  our  display  image 
calculated  AND  displayed  at  the  maximum  rate  required  by  these 
physiological  demands.  Thus  we  need  to  contemplate  the 
calculation  of  image  data  at  a rate  of  at  least  30  mega-pixels 
per  second,  each  having  Red,  Green  and  Blue  components. 
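
The required rate follows directly from the display size and the frame rate. The sketch below assumes the 1000 x 1000 workstation display used elsewhere in this chapter and the 30 frames per second quoted for television viewing; the per-pixel time budget is a derived figure and not a number from the original text.

```python
def pixel_rate(width, height, frames_per_second):
    """Pixels that must be computed each second to sustain the frame rate."""
    return width * height * frames_per_second


rate = pixel_rate(1000, 1000, 30)            # pixels per second
color_values_per_second = rate * 3           # Red, Green and Blue components
budget_ns_per_pixel = 1e9 / rate             # time available for one pixel
```

At 30 mega-pixels per second a single processor has only about 33 nanoseconds per pixel, which has to cover every intersection test and every rendering calculation for that pixel.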

The last item we need to discuss is just what should be
on each pixel. In the natural world each receptor area will
receive light from a small cone segment, a surface patch of
about 1/10 arc-minute across. Each of these patches is the
sum of all light rays that are focussed on that small area.
In calculating an artificial image of a real scene, we need to
remember that each image pixel is a small-area integral that
we need to calculate. We seldom have the luxury of
integrating over angular cones, however small; we normally
sample the scene at one (or a few) point(s) for each pixel.
This is the major cause of picture aliasing: the image
elements are defined with higher resolution than it is possible
to paint on the display surface. Anti-aliasing techniques,
not a subject explored in this work, deal with refining the
coloring information in an attempt to account for the lack of
spatial display ability.

Thus  the  computational  requirements  for  rendering  even  a 
modest  scene  may  easily  exceed  any  reasonable  computing 
resources.  It  is  therefore  very  important  to  look  at 
procedures  that  ease  the  image  computation  task.  Two 
promising  major  avenues  of  approach  are: 

1.  Ways  of  parallelizing  the  computation  task; 

2. Algorithmic approaches to identify the computational
bottlenecks and find effective speed-up procedures.

This dissertation focusses on the second of these
approaches. We will introduce a process for speeding up the
basic ray tracing process. We will use a uniprocessor for
accomplishing the basic ray tracing task, and will use the
same processor to implement the speed-up. Certainly the
process can be adapted to parallel computation, thus
conveying an added benefit to inherently faster computing
systems.


2.2  The  Ray  Tracing  Process 

In  a natural  environment  all  rays  that  make  up  a visible 
scene  in  the  observer's  eye  start  at  various  light  sources, 
then  are  scattered,  reflected  and  directed  into  the  pupil  of 
the  observer's  eye.  Clearly,  most  of  the  light  is  lost  to  the 
observer;  those  parts  of  the  scene  image  are  not  observable 
from  the  location  of  the  observer.  To  mimic  this  process 
exactly  on  a computer,  i.e.  to  trace  all  light  rays  emanating 
from  the  scene,  would  require  the  calculation  of  a great  many 
light  paths  which  would  then  be  not  utilized  in  forming  an 
observer's  image.  This  wasteful  process  is  known  as  Direct 
Ray  Tracing;  the  procedure  models  the  action  of  light  sources 
on  the  scene  to  produce  all  observable  views  of  it.  The 
procedure  is  not  practical. 

For  reasons  of  economy  in  calculation  it  is  necessary 
only  to  compute  those  rays  that  in  fact  will  be  used  in 
forming  the  ultimate  display  image.  Hence  one  starts  with  the 
rays at the image plane and traces the rays backwards to their
sources.  The  process  is  known  as  Reverse  Ray  Tracing.  It 
avoids  the  computational  inefficiency  inherent  in  Direct  Ray 
Tracing.  It  is  the  Reverse  Ray  Tracing  process  which  is 
commonly  known  as  Ray  Tracing;  it  is  the  process  referred  to 
as  Ray  Tracing  in  this  work. 

As  a further  concession  to  practicality,  the  scene 
focussing  is  replaced  by  a simple  pinhole  camera,  rather  than 
getting  involved  in  optical  aperture  complications  [Pot  82]. 
Hence the image to be produced will be a pinhole camera view
of a scene, and thus of fixed focus. It is computed by tracing
the rays that make up the image back to the light sources,
accounting for interactions of objects in the scene with the
rays.

The rays that make up the image are taken to be one for
each pixel that appears on the final image rendering. Hence
for a VGA image, one would need 480 rows x 640 picture
elements = 307,200 pixels (i.e. rays). In a "normal
workstation" there would be about 1000 rows and 1250
pixels/row, or about 1.2 million pixels. For each of these
rays we need to calculate where it intersects the nearest
object in the scene. It is not uncommon for a scene to
contain several hundred objects. The process basically is:

For each ray:
    determine where the ray intersects each object;
    determine the color for the nearest of these points;
    paint this color to the pixel;

Display the whole image.

If the scene consists of N objects, the number of
intersection calculations for a "simple VGA" image is
basically 300,000 x N operations; for a workstation image, there
may be four times as many calculations. Each of these
calculations may require the determination of the intersection
point of a ray with a surface, which may be a curved surface
in three dimensions. These calculations are involved and may
require many thousands of machine cycles [Bli 80].

Once the intersection point is determined, calculation of
the color of that point may involve a great deal of additional
calculation. These calculations are called the rendering
calculations; they determine the color reflected or emitted
toward the viewer by the intersection point of the viewing ray
with the nearest object.
Depending on the nature of the visible surface point, this
calculation may be simple or very complex, especially if that
visible point reflects light, or lies on a light-refracting
surface. For reflected and refracted light, secondary
rays need to be examined which result from the light
reflection/refraction properties of that particular surface
point. Conceptually, one needs to determine the surface
normal relative to the incident ray, and construct rays that
further follow the imaged ray, all the way to the source of
the light that produces the surface coloring. There may be a
number of secondary rays; their intersections with the objects
in the scene then need to be determined. This calculation is
basically the same as the calculation for the ray emanating
from the display plane; however, since there may be many such
secondary rays, the amount of calculation may become quite
large.

It  is  the  objective  of  this  dissertation  to  examine  the 
ray-tracing  image  production  process,  to  identify  those  of  its 
steps  that  are  computation  intensive,  and  to  offer  palliative 
measures  for  the  computational  bottlenecks. 

2.3  Uniprocessor  Implementation 

The  fundamental  ray  tracing  program  has  three  basic 
parts:  input  of  scene  and  viewing  parameters,  calculation  of 
the  visible  picture  elements,  and  production  of  the  visible 
image  to  a display  device  or  a storage  medium.  There  is 
considerable  interaction  between  the  three  segments, 
especially  when  the  image  needs  to  be  produced  in  real  time, 
as  with  dynamic  virtual  reality  environments.  However,  the 
conceptual  program  flow  can  be  considered  to  be  the  sequential 
execution  of  the  three  parts  listed  above.  Together  with  the 
computational  tasks,  the  fundamental  ray  tracing  program  is 
diagramed  in  Figure  2.1  [Hec  89]. 

Since the objective of this dissertation is the
examination of the computational loads for the ray tracing
process, we will concentrate on the second phase of the
program and omit critical examination of the initialization,
input and output tasks. Furthermore, we assume that the
computational task is not further burdened with data access
limitations, i.e. there is sufficient memory to hold all
required data for the various calculations.

Initialize storage, input and output files

Input object geometry data, object surface properties, view
parameters

begin
    vector : intersection_point, reflected_direction,
             transmitted_direction
    colors : local_color, reflected_color, transmitted_color

    color <- black;
    if (depth <= maxdepth)
    begin
        color <- back_ground_color;

        [intersect ray with all objects and find intersection
         point (if any) that is closest to start of ray]
        if (intersection)
        begin
            local_color <- { contribution of local color model at
                             intersection point }

            { Calculate direction of reflected ray }
            Ray_trace (intersection_point, reflected_direction,
                       depth+1, reflected_color)

            { Calculate direction of transmitted ray }
            Ray_trace (intersection_point, transmitted_direction,
                       depth+1, transmitted_color)

            Combine (color, local_color, local_weight_for_surface,
                     reflected_color, reflected_weight_for_surface,
                     transmitted_color, transmitted_weight_for_surface)
        end
    end
end

Output image file


Figure 2.1 Ray Tracing Program

We note from Figure 2.3 that the entire computational
task  is  dominated  by  the  intersection  test,  which  determines 
the  intersection  point  of  a given  ray  with  the  nearest  object. 
Fundamentally,  this  intersection  test  must  be  performed  for 
all  objects  in  the  scene,  but  only  one  object  will  be  retained 
for  further  calculation  of  reflected  or  refracted 
(transmitted)  light. 
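
The exhaustive intersection test can be sketched as follows. This is a minimal illustration, assuming that all objects are spheres given as (center, radius) pairs; the names `ray_sphere_t` and `nearest_hit` are hypothetical and do not appear in the program of Figure 2.1.

```python
import math

def ray_sphere_t(origin, direction, center, radius):
    """Smallest positive ray parameter t at which the ray meets the sphere, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > 1e-9:
            return t
    return None

def nearest_hit(origin, direction, spheres):
    """Test the ray against every sphere; keep only the closest hit as (t, index)."""
    best = None
    for index, (center, radius) in enumerate(spheres):
        t = ray_sphere_t(origin, direction, center, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, index)
    return best

spheres = [((0, 0, 10), 1), ((0, 0, 5), 1), ((5, 5, 5), 1)]
print(nearest_hit((0, 0, 0), (0, 0, 1), spheres))
```

Every sphere is tested for every ray, yet only one result is retained, which is the linear cost the remainder of this work tries to reduce.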

We  note  that  for  a uniprocessor  implementation  the 
various  computational  tasks  are  performed  sequentially.  For 
timing the intersection calculations we need to ensure that
the  various  other  sections  are  excluded  from  the  timing 
measurements.  We  also  note  that  no  shading  calculations  can 
be  made  until  the  visible  surface  element  is  found  in 
establishing  the  coloring  of  an  image  pixel. 

2.4  Possible  Bottlenecks 

From  an  analysis  of  the  algorithm  we  note  that  the 
predominant  computational  bottleneck  is  in  the  intersection 
calculations,  with  secondary  choke  points  in  calculating 
reflected  and  transmitted  (i.e.  refracted)  rays,  as  well  as  in 
calculating  the  coloring  of  a surface  element.  If  the 
surfaces  of  the  objects  are  curved,  then  the  calculation  of 
surface normals, needed for establishing the apparent surface
coloring, can become computationally complex. These secondary
effects have been studied extensively [Bli 80] and have been
the subject of extensive research [Whi 80]. While we do not wish
to minimize these computational tasks, we note that the first
choke point is finding the nearest-object surface element that
is needed for any further coloring calculations. The problem
is basically finding the intersection points of all objects
with a ray, and then finding that point which is nearest to
the origin of the ray. This is implicitly a sorting process;
however, we can make the list of elements to be sorted very
short if we find a computationally efficient way of ignoring
all those objects which cannot intersect a given ray (because
they are located far away from the ray as it radiates into the
object scene from the position of the observer).

The  following  chapters  outline  such  a method. 

2.5  Possible  Speedups 

An examination of the basic ray tracing process reveals
several parts of the process that lend themselves to
computational speedups. These are image clipping,
shortcuts in the object sorting process, and parallelization.
We briefly discuss each of these in the following paragraphs.

2.5.1  Image  Clipping 

Since  basic  ray  tracing  examines  each  object  of  the 
natural scene to determine which one is closest to the
viewpoint, the viewpoint can be located anywhere, including
inside of the object assembly that makes up the final scene.
This  is  merely  a computerized  version  of  building  a scene,  for 
example  a movie  set,  and  moving  a camera  toward,  around  and 
into  it.  At  various  positions  of  the  path  of  the  camera 
images  are  constructed,  which  become  the  sequence  of  frames 
that  define  a dynamic  image,  an  animated  image  sequence. 
There  is  no  need  to  discard  any  of  the  objects  that  make  up 
the  scene,  even  if  those  parts  are  invisible  to  the  viewer, 
such  as  elements  of  the  scene  that  may  become  located  behind 
the  observer.  One  could  cull  the  objects  and  retain  only 
those  in  the  general  direction  of  the  view,  and  thereby 
greatly  reduce  the  required  calculations.  Various  image 
clipping  algorithms  exist  which  can  provide  a list  of  objects, 
and  parts  of  the  original  objects,  which  will  appear  in  the 
final  image.  Some  objects  will  be  only  partly  in  the  final 
scene;  these  elements  must  be  clipped  and  the  parts  outside  of 
the  image  extent  must  be  discarded. 

Since  the  objects  in  the  image  may  be  anything,  the 
clipping  must  be  able  to  handle  arbitrary  primitives.  A 
clipping  algorithm  which  is  so  general  is  itself  a complicated 
procedure  and  may  be  computationally  expensive.  For  general 
object  primitives  there  is  no  solution  available,  short  of  a 
variation  of  the  ray  tracing  algorithm  itself.  Hence  the 
search  for  such  a panacea  is  likely  to  be  fruitless. 


If we restrict the primitives to objects having polygonal
facets only, the Weiler-Atherton algorithm will work well
[Wei 77]. This algorithm will determine the visible parts of
all objects defined by finite polygonal or infinite planar
facets. The procedure is complex and computationally
expensive.

However, a much simpler approach is more promising. We
note that the visibility of the objects will be determined by
the ray tracing process itself; we merely would like to reduce
the number of elements outside of the viewing frustum that
need to be tested. We simply discard entire objects which
have all of their vertices outside of the viewing image. This
requires a minor modification: each outside-of-view test is
applied, in view coordinates, to a single extreme vertex of
each object. The test for the top edge is applied to the lowest
vertex, the test for the left edge to the right-most vertex,
the test for the bottom edge to the highest vertex, and the
test for the right edge to the left-most vertex. These tests can
be applied when the objects are placed into the view
coordinate system: for each object the indicated tests are
performed at the end of each object's coordinate
transformation.
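
The extreme-vertex tests can be sketched as follows. This is one possible reading of the procedure, assuming that each object is a list of (x, y) vertices already projected into view coordinates and that the view is the rectangle [left, right] x [bottom, top]; the function names are hypothetical.

```python
def outside_view(vertices, left, right, bottom, top):
    """True if all vertices lie beyond one and the same edge of the view."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(ys) > top        # lowest vertex is above the top edge
            or max(xs) < left    # right-most vertex is left of the left edge
            or max(ys) < bottom  # highest vertex is below the bottom edge
            or min(xs) > right)  # left-most vertex is right of the right edge

def cull(objects, left, right, bottom, top):
    """Keep only the objects that may contribute to the image."""
    return [obj for obj in objects
            if not outside_view(obj, left, right, bottom, top)]

above = [(0, 2), (1, 3), (-1, 2.5)]       # entirely above the view
spanning = [(-5, -5), (5, -5), (0, 5)]    # all vertices outside, yet it covers the view
print(len(cull([above, spanning], -1, 1, -1, 1)))
```

The second triangle shows why the test is applied per edge: all of its vertices are outside the view, but the object covers the view and must be retained.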

Thus  there  is  a relatively  simple  test  to  reduce  the 
number  of  candidate  objects  for  the  ray  intersection 
calculations.  This  step  alone  can  noticeably  reduce  the  total 
amount  of  required  calculations.  We  assume  that  such  a 
preliminary  pruning  is  done  on  the  original  object  list,  and 
that the list of objects has already been reduced.

2.5.2 Simplified Sorting

The  next  problem  is  to  sort  objects  in  a way  that  avoids 
many  comparisons.  A given  object  may  be  nowhere  in  the 
direction  of  the  ray  being  used  for  construction  of  a pixel. 
Finding  these  "inactive"  objects  with  a computationally 
efficient  process  is  highly  desirable. 

2.5.3  Parallelization 

One  of  the  easiest  concepts  for  speeding  up  the  ray 
tracing  calculation  is  to  realize  that  physically  every  ray  is 
independent  of  the  other  rays;  thus  each  ray  can  be  assigned 
to  a separate  processor.  Then  the  total  image  generation  time 
is  merely  the  time  taken  by  the  most  involved  ray  calculation; 
i.e.  tracing  of  the  most  complex  ray.  For  an  "ordinary"  VGA 
image  one  could  use  some  300,000  processors,  assuming  that  few 
secondary  rays  need  be  calculated.  For  more  complex  scenes  on 
workstations,  over  a million  processors  could  be  employed. 
There  is  also  the  problem  of  defining  the  image  for  each 
processor's  use  (by  a data  broadcasting  process),  and 
assembling  the  results  into  a single  image  buffer  (so  that 
calculation of the next scene can be started while the old
image is displayed).

This  parallelization  would  require  each  processor  to 
calculate  the  ray  path  for  each  pixel  AND  would  ignore  any 
information  the  neighboring  processors  may  have  about  the 
image structure, i.e. image coherence is ignored. Thus the
process may be the fastest, but it is also the most wasteful of
computing resources.

A reasonable  solution  is  to  map  the  ray  tracing  process 
to  the  engines  that  comprise  the  computing  environment.  This 
dissertation  merely  notes  this,  and  will  not  attempt  to  delve 
into  it.  We  will  concentrate  on  finding  speedups  in  the 
uniprocessor  solution,  and  leave  the  mapping  of  the  improved 
process  as  a desirable  extension  for  further  work.  Work  in 
defining  image  composition  hardware  has  been  completed  [Fie 
88];  these  together  with  the  structures  defined  in  this 
dissertation  point  toward  a class  of  image  generation 
machinery  for  calculating  rich  visual  environments  for  use  in 
dynamic  artificial  reality  scenes. 

2.6 Summary

This  chapter  reviewed  the  parameters  involved  in  human 
vision  and  the  ideas  underlying  the  basic  ray  tracing  process. 
It  showed  that  reasonable  computing  machinery  and  existing 
affordable  display  technology  can  present  but  an  approximation 
to  a naturally  observed  scene.  The  image  quality  is  limited 
by  resolution  of  the  display  device  and  by  the  computational 
capabilities  of  the  machinery  that  may  be  used  to  produce  the 
actual image. However, within these limitations, visually
stunning images can be produced. It is the
intent of this dissertation to address one of the
computational bottlenecks in the calculation of
computer-produced images: the number of objects that
need to be considered during the calculation of visible parts
of a scene. In achieving this goal we also specify design
parameters  for  simple  computational  structures  which  can  be 
cast  in  VLSI  hardware.  We  will  verify  the  correctness  of 
these  designs  and  use  simulation  to  estimate  their  impact  on 
the  performance  of  the  basic  ray  tracing  procedure. 


CHAPTER  3 

FAST  RAY  TRACING  ALGORITHMS 
3.1  Introduction 

The  basic  philosophy  of  ray  tracing  is  that  an  observer 
sees  a point  on  a display  surface  as  a result  of  the 
interaction  of  the  surface  at  that  point  with  rays  emanating 
from  the  scene.  In  general  a light  ray  may  reach  the  surface 
indirectly  via  reflection  at  other  surfaces,  transmission 
through  partially  transparent  objects  or  a combination  of 
these.  Ray  tracing  is  the  most  complete  simulation  of  an 
illumination/reflection  model  in  computer  graphics.  Ray 
tracing  procedures  have  produced  the  most  realistic  images  to 
date in computer graphics.

But the method has a number of disadvantages, the most
critical being extremely high processing requirements. Image
generation time can be hours or even days, even on powerful
computer systems. As a result, ray tracing is often considered
impractical. Much research has been done to improve the
procedure, and several techniques have gained wide acceptance.
In this chapter we will discuss four algorithms and point out
the problems of each of them.

In accelerating the process of ray tracing, there are three
very distinct strategies to consider [Arv 89]:

1)  reducing  the  average  cost  of  finding  the  intersection 
of  a ray  with  the  environment; 

2)  reducing  the  total  number  of  rays  intersected  with  the 
environment; 

3)  replacing  individual  rays  with  a more  general  entity. 

These  strategies  are  called  "faster  intersections", 
"fewer  rays"  and  "generalized  rays",  respectively.  The  last 
strategy  (generalized  rays)  places  some  constraint  on  the 
environment  that  can  be  considered,  such  as  restricting  the 
type  of  primitive  objects,  or  we  may  need  to  abandon  the 
notion  of  exact  intersection  calculations,  accepting  an 
approximation instead [Arv 87]. Examples are beam
tracing, cone tracing, and pencil tracing [Fol 90] [Arv 89]. These
procedures  are  peripheral  to  the  study  being  presented  here 
and  will  not  be  discussed  further. 

The other two strategies, faster intersections and fewer
rays, can be combined under the heading of speedup
algorithms. In this case obtaining faster ray intersections
with objects is still the key to a fast algorithm. The
strategy used to obtain faster intersections separates this
class into the subcategories of "faster" and "fewer"
ray-object intersections. The former consists of efficient
algorithms for intersecting rays with specific primitive
objects, while the latter addresses the larger problem of
intersecting a ray with an environment using a minimum of
ray-object intersection tests.

The  following  four  algorithms  are  representative  fast 
algorithms  which  can  be  placed  into  the  above  two 
subcategories:

1)  Bounding  volume  algorithms; 

2)  Hierarchical  bounding  volume  algorithms; 

3)  Uniform  spatial  subdivision  algorithms; 

4)  Non-uniform  spatial  subdivision  algorithms. 

3.2  Bounding  Volume  Algorithms 

The  most  fundamental  and  ubiquitous  tool  for  ray  tracing 
acceleration  is  the  bounding  volume.  The  idea  of  using  a 
bounding  volume  takes  advantage  of  the  fact  that  rays  usually 
miss  the  objects  they  are  tested  against.  By  enclosing 
complicated  objects  in  invisible  bounding  volumes  that  are 
easy  to  intersect,  one  can  avoid  complicated  object 
intersection  calculations.  This  leads  to  fewer  calculations, 
especially  if  the  ray  misses  the  bounding  volume.  Because  the 
bounding  volume  intersection  test  is  simpler  than  object 
intersection, this algorithm may reduce the computation by a
considerable amount, but cannot improve upon the linear time
complexity of exhaustive ray tracing [Arv 89]. The reason can
be explained with the objects in Figure 3.1. How do we bound
these objects? How many bounding volumes are employed? What
kinds of shapes are best as bounding volumes? The simplest
bounding volume is a sphere (a circle in this two-dimensional
example) because its intersection test is easier
than that of any other type. When we employ one sphere as the
bounding volume, the bounding volume is too large, and every
ray will hit it. In this case the bounding
volume algorithm must pay the cost of three object
intersection tests plus the cost of the bounding volume
intersection test. When we employ three bounding volumes, one on
each object, the bounding volumes for B and C have problems
similar to those of the single bounding volume case. Even though
empty space is reduced by employing three bounding volumes,
volumes 2 and 3 still contain large empty spaces.

Figure 3.1 Bounding Volumes
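
The simplicity of the sphere test can be seen in the following sketch, which decides whether a ray hits a sphere from the coefficients of the intersection quadratic alone, without evaluating a square root. It is an illustration under our own naming (`ray_hits_sphere` is hypothetical) and is not the comparison procedure developed later in this work.

```python
def ray_hits_sphere(origin, direction, center, radius):
    """Boolean ray-sphere test using only the quadratic coefficients."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    half_b = sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    if c <= 0.0:
        return True    # ray origin lies inside (or on) the sphere
    if half_b >= 0.0:
        return False   # sphere lies behind the ray origin
    return half_b * half_b - a * c >= 0.0   # sign of the discriminant

print(ray_hits_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1))
print(ray_hits_sphere((0, 0, 0), (0, 0, 1), (3, 0, 5), 1))
```

A ray rejected by this test needs no further work for any object enclosed by the sphere.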

Another solution is to employ a box as a bounding
volume (in the two-dimensional case a rectangle). By employing
fitted bounding volumes we reduce the empty space in each
bounding volume. The problem in this case arises with object
C. The ray intersection test for object C requires three
ray-line intersection calculations, but the ray intersection
test for its bounding volume requires four. There is no
advantage if intersecting the bounding volume is as expensive
as intersecting the object. Weghorst identifies this as the key
problem of the bounding volume algorithm [Weg 84]. The void
area of a bounding volume is defined as the difference between
the projected areas of the bounding volume and the item. He
considers the following two problems when bounding volumes are
employed:

1.  void  area  of  bounding  volume; 

2.  the  cost  of  intersecting  the  bounding  volumes. 
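
The first item can be illustrated in two dimensions. The sketch below computes the void area of a rectangular and of a circular bound for a triangle; the triangle, the helper names, and the choice of a circle centered on the rectangle's center are all assumptions made for illustration.

```python
import math

def polygon_area(vertices):
    """Shoelace formula for a simple polygon given as (x, y) pairs."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def box_void_area(vertices):
    """Area of the axis-aligned bounding rectangle minus the item area."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (max(xs) - min(xs)) * (max(ys) - min(ys)) - polygon_area(vertices)

def circle_void_area(vertices):
    """Area of a bounding circle (centered on the box center) minus the item area."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    cx, cy = (max(xs) + min(xs)) / 2.0, (max(ys) + min(ys)) / 2.0
    radius = max(math.hypot(x - cx, y - cy) for x, y in vertices)
    return math.pi * radius * radius - polygon_area(vertices)

triangle = [(0, 0), (4, 0), (0, 3)]
print(box_void_area(triangle), circle_void_area(triangle))
```

For this triangle the circle has more than twice the void area of the rectangle, so more rays will hit the bound and miss the item, while the rectangle is the more expensive bound to test.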

In  his  computer  experiment,  Weghorst  applied  two  bounding 
volumes  (sphere,  box)  to  the  same  object  for  deciding  which 
one  is  better.  He  considered  the  case  of  only  changing  the 
view  point  and  he  measured  two  performance  indices  for  each 
type  of  bounding  volume.  From  his  experiments  he  concluded 
that  the  optimal  choice  of  bounding  volume  is  ray  dependent. 

However,  in  the  general  case,  as  the  illumination  becomes 
more  complex  or  the  environment  becomes  specular  or 
transparent,  ray  dependency  becomes  more  difficult  to  consider 
since  rays  are  more  likely  to  arrive  from  any  direction. 

The  best  bounding  volume  depends  on  both  the  expense  of 
performing  tests  on  the  bounding  volume  itself  and  on  how  well 
the  volume  protects  the  enclosed  object  from  tests  that  do  not 
yield  an  intersection.  The  criterion  for  the  selection  of  the 
bounding  volume  is  to  minimize  the  total  cost  function  T of 
the  intersection  test  for  an  object.  This  total  cost  function 
is defined by Weghorst, Hooper, and Greenberg as follows
[Weg 84]:

T = b * B + i * I 

where 

T : total cost function

b : number of times that the bounding volume is tested for
intersection

B : cost of testing the bounding volume for intersection

i : number of times that the item is tested for intersection

I : cost of testing the item for intersection.

For  a specific  item  in  an  environment,  with  a given  view, 
b and  I are  constant.  However  by  manipulating  the  shape  and 
size  of  the  bounding  volume,  B and  i can  be  varied  to  reduce 
the  total  cost  function.  The  two  elements  B and  i are 
generally  interdependent.  For  example,  reducing  B by  reducing 
the  complexity  of  the  bounding  volume  will  almost  certainly 
increase  i.  To  minimize  the  total  cost  function,  we  only 
assign  bounding  volumes  to  those  items  whose  intersection 
tests  are  sufficiently  complex  to  warrant  one.  Certain  items 
such as spheres, cylinders, and rectangular parallelepipeds
need not be bounded. In this case the ray tracing algorithm
maintains two lists (one is the bounding volume list, the
other the simple object list).
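
A small worked example may clarify the trade-off between B and i. All numbers below are hypothetical cost units chosen for illustration; they are not measurements from [Weg 84] or from this work.

```python
def total_cost(b, B, i, I):
    """Total cost T = b * B + i * I of the intersection tests for one item."""
    return b * B + i * I

rays = 1000          # b: every ray is tested against the bound
item_cost = 50       # I: cost of one test against the item itself

unbounded = total_cost(rays, 0, rays, item_cost)   # no bound: i = b, B = 0
sphere = total_cost(rays, 5, 300, item_cost)       # cheap, loose bound
box = total_cost(rays, 12, 120, item_cost)         # costlier, tighter bound
print(unbounded, sphere, box)
```

With these assumed values the tighter bound wins even though its own test is more expensive; for an item with a cheap intersection test the same bounds would increase the total cost.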

The bounding volume algorithm can be added to the original
ray tracing algorithm. Figure 3.2 shows the bounding volume
algorithm. This algorithm employs a bounding volume list and
performs intersection tests with the objects after finishing
the ray-bounding volume intersection tests. From a theoretical
point of view the bounding volume algorithm may reduce the
computation by a constant factor, but cannot improve upon
the linear time complexity of exhaustive ray tracing. The
main problem of bounding volumes is that defining the
optimal bounding volume is difficult.

procedure Ray_trace (start, direction, depth, color)
vector : start, direction
integer : depth
colors : color

begin
    vector : intersection_point, reflected_direction,
             transmitted_direction
    colors : local_color, reflected_color, transmitted_color

    color <- black;
    if (depth <= maxdepth)
    begin
        color <- back_ground_color;
        while (bounding volume)
            [intersect ray with all objects in a bounding volume
             and find intersection point (if any) that is closest
             to start of ray]
            bounding volume <- next bounding volume;
        endwhile

        if (intersection)
        begin
            local_color <- { contribution of local color model at
                             intersection point }

            { Calculate direction of reflected ray }
            Ray_trace (intersection_point, reflected_direction,
                       depth+1, reflected_color)

            { Calculate direction of transmitted ray }
            Ray_trace (intersection_point, transmitted_direction,
                       depth+1, transmitted_color)

            Combine (color, local_color, local_weight_for_surface,
                     reflected_color, reflected_weight_for_surface,
                     transmitted_color, transmitted_weight_for_surface)
        end
    end
end


Figure 3.2 Bounding Volume Algorithm
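
The intersection loop of the bounding volume algorithm can be sketched in runnable form. The sketch assumes that both the bounding volumes and the enclosed objects are spheres, and it counts the object tests to make the saving visible; the names `sphere_t` and `trace_with_bounds` are hypothetical.

```python
import math

def sphere_t(origin, direction, center, radius):
    """Smallest positive ray parameter t at which the ray meets the sphere, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > 1e-9:
            return t
    return None

def trace_with_bounds(origin, direction, groups):
    """groups: list of (bound_center, bound_radius, [(center, radius), ...]).
    Returns (nearest t or None, number of object tests performed)."""
    best, object_tests = None, 0
    for bound_center, bound_radius, objects in groups:
        if sphere_t(origin, direction, bound_center, bound_radius) is None:
            continue                      # ray misses the bound: skip its objects
        for center, radius in objects:
            object_tests += 1
            t = sphere_t(origin, direction, center, radius)
            if t is not None and (best is None or t < best):
                best = t
    return best, object_tests

groups = [((0, 0, 5), 2, [((0, 0, 5), 1), ((0.5, 0, 5.5), 0.5)]),
          ((10, 0, 5), 2, [((10, 0, 5), 1)])]
print(trace_with_bounds((0, 0, 0), (0, 0, 1), groups))
```

Every bounding volume is still visited for every ray, which is why the cost remains linear in the number of bounding volumes.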

3.3  Hierarchical  Bounding  Volume  Algorithm 

A common extension to bounding volumes is to
impose a hierarchical structure of bounding volumes on the
scene.

Where possible, objects in close spatial proximity are
allowed to form clusters, and the clusters are themselves
enclosed in bounding volumes. Figure 3.3 shows a bounding
volume A which contains one large object B and another
bounding volume C, which has four small bounding volumes (C1,
C2, C3, C4) inside it. The tree represents the hierarchical
relationship between the seven boundary extents A, B, C, C1,
C2, C3, C4. By enclosing a number of bounding volumes within
a larger bounding volume it is possible to eliminate many
objects from further consideration with a single intersection
check. If a ray does not intersect the parent volume, there is
no need to test it against the bounding volumes or objects
contained within. Tracing a ray against the bounding volumes
amounts to traversing such a tree from the top-most level.

A ray that intersects C1 in Figure 3.3 is
tested against the bounding volumes C1, C2, C3 and C4, but
only because it intersects the bounding volume representing
that cluster. A ray that misses bounding volume A need not be
tested against the bounding volumes inside of A. This
intersection algorithm is implemented in Figure 3.4. The
hierarchical bounding volume algorithm employs an intersection
procedure (HBV_intersect). The data structure for this
hierarchy is assumed to be a tree with an arbitrary branching
factor at each internal node. Thus bounding volumes may enclose
any number of other bounding volumes. Each leaf node of the tree
is a single primitive object while each interior node consists
of a bounding volume. The procedure Intersect in HBV_intersect
performs a ray-object intersection for the given ray
information (origin, direction) and an object. The function
"Intersect_P" is very similar to "Intersect" except that it
returns a boolean value indicating whether an intersection was
found or not. The intersection process of the hierarchical
bounding volume algorithm begins with the root node of the tree.

Figure 3.3 Bounding Volume Hierarchies (with the bounding volume tree structure)
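
The recursive traversal can also be written as a short runnable sketch. It assumes spherical bounds and spherical primitives and builds a tree loosely modeled on Figure 3.3; the `Node` class, the coordinates, and the function names are illustrative assumptions rather than the data structures of Figure 3.4.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Sphere = Tuple[Tuple[float, float, float], float]

def sphere_t(origin, direction, center, radius):
    """Smallest positive ray parameter t at which the ray meets the sphere, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > 1e-9:
            return t
    return None

@dataclass
class Node:
    bound: Sphere
    children: List["Node"] = field(default_factory=list)
    obj: Optional[Sphere] = None          # set only on leaf nodes

def hbv_intersect(origin, direction, node, best=None):
    """Return the nearest t found in the subtree rooted at node."""
    if node.obj is not None:              # leaf: test the primitive itself
        t = sphere_t(origin, direction, *node.obj)
        if t is not None and (best is None or t < best):
            best = t
    elif sphere_t(origin, direction, *node.bound) is not None:
        for child in node.children:       # descend only if the bound is hit
            best = hbv_intersect(origin, direction, child, best)
    return best

def leaf(center, radius):
    return Node(bound=(center, radius), obj=(center, radius))

cluster = Node(((3.5, 0.5, 10), 1.2),
               [leaf((3, 0, 10), 0.3), leaf((4, 0, 10), 0.3),
                leaf((3, 1, 10), 0.3), leaf((4, 1, 10), 0.3)])
root = Node(((0, 0, 10), 6), [leaf((-2, 0, 10), 1.5), cluster])
print(hbv_intersect((0, 0, 0), (3, 0, 10), root))
```

A ray that misses the root bound returns immediately without testing any of the five primitives.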

To  alleviate  the  bounding  volume  problem,  Rubin  and 
Whitted  [Rub  80]  introduce  the  hierarchical  bounding  volumes 
algorithm  to  ray  tracing  in  order  to  attain  a theoretical  time 
complexity  which  is  logarithmic  in  the  number  of  objects 
instead  of  being  linear.  But  constructing  a bounding  volume 
hierarchy  involves  two  considerations: 

1)  which  bounding  volumes  to  enclose; 

2)  what  type  of  bounding  volume  to  enclose  them  with. 

This  is  a challenging  problem  because  the  number  of 
possible  hierarchical  groupings  of  objects  grows  exponentially 
with  the  number  of  objects,  making  an  exhaustive  search 
totally impractical. There are some suggestions on
constructing the hierarchy of bounding volumes [Gol 80] [Rub
80] [Weg 84].

procedure Ray_trace (start, direction, depth, color)
vector : start, direction
integer : depth
colors : color

begin
    vector : intersection_point, reflected_direction,
             transmitted_direction
    colors : local_color, reflected_color, transmitted_color

    color <- black;
    if (depth <= maxdepth)
    begin
        color <- back_ground_color;

        HBV_intersect (start, direction, node);
        if (intersection)
        begin
            local_color <- { contribution of local color model at
                             intersection point }

            { Calculate direction of reflected ray }
            Ray_trace (intersection_point, reflected_direction,
                       depth+1, reflected_color)

            { Calculate direction of transmitted ray }
            Ray_trace (intersection_point, transmitted_direction,
                       depth+1, transmitted_color)

            Combine (color, local_color, local_weight_for_surface,
                     reflected_color, reflected_weight_for_surface,
                     transmitted_color, transmitted_weight_for_surface)
        end
    end
end

procedure HBV_intersect (origin, direction, node)
vector : origin, direction
globalpointer : *node

begin
    if node is a leaf then
        Intersect (origin, direction, node.object);
    elseif Intersect_P (origin, direction, node.bounding_volume) then
        for each child of node do
            HBV_intersect (origin, direction, child);
        endfor
    endif
end


Figure 3.4 Hierarchical Bounding Volume Algorithm

The potential clustering and the depth of the hierarchy
depend on the nature of the scene. The problem with bounding
volume hierarchies is that they are not convenient for a user
to specify. That drawback is addressed by techniques for
generating bounding volume hierarchies automatically [Nic 94].

3.4  Nonuniform  Spatial  Subdivision  Algorithm 

Bounding  volume  hierarchies  provide  a means  of 
recursively  narrowing  the  focus  of  the  search  to  more 
promising  candidates  for  intersection.  Bounding  volume 
hierarchies  organize  objects  bottom-up;  in  contrast  spatial 
subdivision  algorithms  (uniform  or  nonuniform  case)  begin  with 
a different  philosophy.  Spatial  partitioning  subdivides  space 
top-down, i.e., we rely on simple volumes to identify objects
which  are  good  candidates  for  intersection,  but  these  simple 
volumes  are  constructed  by  applying  a divide-and-conquer 
technique  to  the  space  surrounding  the  objects  instead  of 
considering  the  objects  themselves.  One  may  construct  the 
volumes  in  a top-down  fashion  by  partitioning  a volume 
bounding  the  environment  into  smaller  pieces.  The  smaller 
volumes  are  assigned  a collection  of  objects  which  are  totally 
or  partially  contained  within  them.  The  spatial  subdivision 
algorithm selects sets of objects based on given volumes. Each
such small volume is an axis-aligned rectangular prism, called
a "voxel". A preprocessing step is responsible for
constructing non-overlapping voxels.

The  basic  idea  of  the  spatial  subdivision  algorithm  is 
that  a ray  imposes  a strict  ordering  on  the  pierced  voxels 
based  on  the  distance  to  the  point  at  which  the  ray  first 
enters each voxel. Because all intersection points within a
voxel are closer to the ray origin than those in all subsequent
voxels, if we process the
voxels  in  the  order  in  which  they  are  encountered  along  the 
ray,  we  need  not  consider  the  contents  of  any  further  voxels 
once  we  have  found  a point  of  intersection. 
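
The early termination can be sketched as follows. For brevity the sketch finds the pierced voxels with a slab test and sorts them by entry distance, which is a simplified stand-in for an incremental voxel traversal; the voxel layout, the spherical objects, and all function names are assumptions.

```python
import math

def sphere_t(origin, direction, center, radius):
    """Smallest positive ray parameter t at which the ray meets the sphere, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > 1e-9:
            return t
    return None

def box_span(origin, direction, lo, hi):
    """Slab test: (t_enter, t_exit) for an axis-aligned box, or None if missed."""
    t_enter, t_exit = 0.0, math.inf
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            if o < a or o > b:
                return None
            continue
        t0, t1 = (a - o) / d, (b - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter, t_exit = max(t_enter, t0), min(t_exit, t1)
        if t_enter > t_exit:
            return None
    return t_enter, t_exit

def trace_voxels(origin, direction, voxels):
    """voxels: list of (lo, hi, [(center, radius), ...]).
    Returns (nearest t or None, number of voxels examined)."""
    pierced = []
    for lo, hi, objects in voxels:
        span = box_span(origin, direction, lo, hi)
        if span is not None:
            pierced.append((span, objects))
    pierced.sort(key=lambda entry: entry[0][0])     # order along the ray
    for visited, ((t_enter, t_exit), objects) in enumerate(pierced, start=1):
        best = None
        for center, radius in objects:
            t = sphere_t(origin, direction, center, radius)
            # accept only hits that lie inside the current voxel
            if t is not None and t_enter <= t <= t_exit:
                if best is None or t < best:
                    best = t
        if best is not None:
            return best, visited                    # later voxels are never examined
    return None, len(pierced)

voxels = [((-1, -1, 4), (1, 1, 6), [((0, 0, 5), 0.5)]),
          ((-1, -1, 0), (1, 1, 2), []),
          ((-1, -1, 2), (1, 1, 4), [((0, 0, 3), 0.5)])]
print(trace_voxels((0, 0, -1), (0, 0, 1), voxels))
```

In this example the search stops in the second voxel along the ray, and the object in the third voxel is never tested.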

There  are  two  types  of  spatial  subdivision  schemes: 
uniform  spatial  subdivision  and  nonuniform  spatial 
subdivision.  Nonuniform  spatial  subdivision  techniques 
discretize  space  into  regions  of  varying  size  in  order  to 
conform  to  features  of  the  environment.  This  variation  in  size 
allows  more  subdivisions  to  be  formed  in  densely  populated 
regions  of  space  and  it  allows  large  voxels  to  cover  regions 
which  are  sparsely  populated  or  are  entirely  void. 

An octree is one possible data structure for
creating and organizing such a collection of voxels. Glassner
[Gla  84]  introduces  an  octree  variation  for  use  in  ray 
tracing.  In  the  creation  of  the  octree,  a box  containing  the 
environment  is  recursively  subdivided  until  each  voxel 
contains  fewer  than  some  threshold  number  of  intersection 
candidates  or  until  a storage  limitation  is  reached.  After 
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constructing  the  octree,  we  trace  rays  through  the  algorithm 
in  Figure  3.5.  In  Glassner's  approach,  nodes  of  the  octree  are 
linked  and  accessed  by  uniquely  defined  names  rather  than 
sorting  explicit  pointers  to  descendent  nodes.  To  access  data 
associated  with  a node  name,  the  name  is  used  to  retrieve  a 
pointer  from  a hash  table.  Glassner  observed  that  simply 
computing  the  name  modulo  the  size  of  the  hash  table  serves  as 
a good  hashing  function.  If  a ray  hits  nothing  within  a voxel 
we  must  proceed  to  the  next  voxel  pierced  by  that  ray. 
Glassner' s algorithm  accomplishes  this  task  by  keeping  the 
minimum  length  of  voxels  (the  resolution  of  voxel)  in  nodes  of 
the  octree.  The  movement  to  the  next  voxel  is  accomplished  by 
finding  a point  within  the  next  voxel  and  performing  the 
lookup . 
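
The name-based lookup described above can be illustrated with a minimal sketch. The naming convention (root named 1, each child named by appending an octant digit 1 to 8), the table size, and all identifiers are assumptions made for illustration; only the "name modulo table size" hash comes from the text.

```python
# Illustrative sketch of name-based octree node lookup.
# TABLE_SIZE and the digit-appending naming rule are assumptions.

TABLE_SIZE = 101  # hypothetical hash table size


def child_name(parent_name: int, octant: int) -> int:
    """Name a child by appending its octant digit (1..8) to the parent's name."""
    if not 1 <= octant <= 8:
        raise ValueError("octant must be in 1..8")
    return parent_name * 10 + octant


def hash_name(name: int, table_size: int = TABLE_SIZE) -> int:
    """Hash a node name by taking it modulo the table size."""
    return name % table_size


class NodeTable:
    """Chained hash table mapping node names to node data."""

    def __init__(self, table_size: int = TABLE_SIZE):
        self.table_size = table_size
        self.buckets = [[] for _ in range(table_size)]

    def insert(self, name, data):
        self.buckets[hash_name(name, self.table_size)].append((name, data))

    def lookup(self, name):
        for stored_name, data in self.buckets[hash_name(name, self.table_size)]:
            if stored_name == name:
                return data
        return None  # no node with this name


table = NodeTable()
root = 1
table.insert(root, {"objects": []})
leaf = child_name(child_name(root, 3), 7)  # octant 7 of octant 3 of the root
table.insert(leaf, {"objects": ["sphere_a"]})
```

No pointers between nodes are stored here; a descendant is reached by computing its name and looking it up, which is the trade-off the text attributes to Glassner's scheme.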

Another type of data structure for creating and organizing such
a collection of voxels, suggested by Kaplan and by Jansen, uses
binary space partitioning trees (BSP trees). BSP trees obviate
the need for voxel names and hashing at the expense of a
potential increase in storage. Figure 3.6 shows a spatial
subdivision algorithm based on BSP trees. This algorithm is
suggested by Jansen [Arv 89].

The main differences between Glassner's and Jansen's approaches
are the data structure for voxels and the movement to the next
voxel. Instead of finding the next voxel by creating a point
guaranteed to fall within it and traversing the hierarchical
structure from the root, Jansen's algorithm

procedure Ray_trace (start, direction, depth, color)
vector  : start, direction
integer : depth
colors  : color
begin
    vector : intersection_point, reflected_direction,
             transmitted_direction
    colors : local_color, reflected_color, transmitted_color

    color <- black;
    if (depth <= maxdepth)
    begin
        color <- back_ground_color;
        Octree_intersect (start, direction);
        if (intersection)
        begin
            local_color <- [ contribution of local color model at
                             intersection point ]
            { Calculate direction of reflected ray }
            Ray_trace (intersection_point, reflected_direction,
                       depth+1, reflected_color)
            { Calculate direction of transmitted ray }
            Ray_trace (intersection_point, transmitted_direction,
                       depth+1, transmitted_color)
            Combine (color, local_color, local_weight_for_surface,
                     reflected_color, reflected_weight_for_surface,
                     transmitted_color, transmitted_weight_for_surface)
        end
    end
end

procedure Octree_intersect (origin, direction)
vector : origin, direction
begin
    vector : Q
    Q <- origin;
    repeat
        [ locate the voxel which contains Q ]
        for each object in the voxel do
            Intersect (origin, direction, object);
        endfor
        if (no intersection)
            Q <- a point in the next voxel pierced by ray;
        endif
    until an intersection is found or Q is outside the
          environment
end

Figure 3.5 Non-uniform Spatial Subdivision Algorithm (Octree)

procedure Ray_trace (start, direction, depth, color)
vector  : start, direction
integer : depth
colors  : color
begin
    vector : intersection_point, reflected_direction,
             transmitted_direction
    colors : local_color, reflected_color, transmitted_color

    color <- black;
    if (depth <= maxdepth)
    begin
        color <- back_ground_color;
        BSP_intersect (start, direction, node);
        if (intersection)
        begin
            local_color <- [ contribution of local color model at
                             intersection point ]
            { Calculate direction of reflected ray }
            Ray_trace (intersection_point, reflected_direction,
                       depth+1, reflected_color)
            { Calculate direction of transmitted ray }
            Ray_trace (intersection_point, transmitted_direction,
                       depth+1, transmitted_color)
            Combine (color, local_color, local_weight_for_surface,
                     reflected_color, reflected_weight_for_surface,
                     transmitted_color, transmitted_weight_for_surface)
        end
    end
end

procedure BSP_intersect (origin, direction, node)
vector        : origin, direction
globalpointer : *node
begin
    if ray_interval is empty or node is nil then return
    if node is a leaf then
        for each object in the node do
            Intersect (origin, direction, object);
        endfor
    else
        near <- ray clipped to near side of node.partition;
        BSP_intersect (near.origin, near.direction,
                       pointer to near half space);
        if (no intersection)
            far <- ray clipped to far side of node.partition;
            BSP_intersect (far.origin, far.direction,
                           pointer to far half space);
        endif
    endif
end
Figure 3.6 Non-uniform Spatial Subdivision Algorithm (BSP)

recursively descends all the branches of the BSP tree which
terminate at pierced voxels, making use of each partition node
only once per ray.
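
A runnable sketch of this near-side-first recursive descent may help. It assumes axis-aligned partition planes and spheres as the only primitive; the names `Node`, `Sphere`, `hit_sphere` and `bsp_intersect` are hypothetical and are not taken from Jansen's formulation.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]


@dataclass
class Sphere:
    name: str
    center: Vec
    radius: float


@dataclass
class Node:
    axis: int = 0                     # 0, 1, 2 for x, y, z (interior nodes)
    split: float = 0.0                # position of the partition plane
    low: Optional["Node"] = None      # half space below the plane
    high: Optional["Node"] = None     # half space above the plane
    objects: List[Sphere] = field(default_factory=list)  # leaves only


def hit_sphere(o: Vec, d: Vec, s: Sphere) -> Optional[float]:
    """Smallest non-negative ray parameter of a ray-sphere hit, or None."""
    oc = [o[i] - s.center[i] for i in range(3)]
    a = sum(x * x for x in d)
    b = 2.0 * sum(oc[i] * d[i] for i in range(3))
    c = sum(x * x for x in oc) - s.radius ** 2
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t >= 0.0:
            return t
    return None


def bsp_intersect(node, o, d, t_min, t_max):
    """Return (t, name) of the nearest hit within [t_min, t_max], or None."""
    if node is None or t_min > t_max:
        return None
    if node.low is None and node.high is None:  # leaf
        best = None
        for s in node.objects:
            t = hit_sphere(o, d, s)
            # Accept only hits inside this leaf's part of the ray.
            if t is not None and t_min <= t <= t_max:
                if best is None or t < best[0]:
                    best = (t, s.name)
        return best
    a = node.axis
    below = o[a] < node.split or (o[a] == node.split and d[a] <= 0.0)
    near, far = (node.low, node.high) if below else (node.high, node.low)
    if d[a] == 0.0:                       # ray parallel to the plane
        return bsp_intersect(near, o, d, t_min, t_max)
    t_split = (node.split - o[a]) / d[a]  # where the ray crosses the plane
    if t_split > t_max or t_split <= 0.0:
        return bsp_intersect(near, o, d, t_min, t_max)
    if t_split < t_min:
        return bsp_intersect(far, o, d, t_min, t_max)
    hit = bsp_intersect(near, o, d, t_min, t_split)
    if hit is None:                       # descend far side only on a miss
        hit = bsp_intersect(far, o, d, t_split, t_max)
    return hit


tree = Node(axis=0, split=0.0,
            low=Node(objects=[Sphere("A", (-2.0, 0.0, 0.0), 1.0)]),
            high=Node(objects=[Sphere("B", (2.0, 0.0, 0.0), 1.0)]))
```

Each interior node is visited at most once per ray, and the far half space is examined only when the near one yields no accepted hit.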

Figure  3.7  shows  the  nonuniform  spatial  subdivision 
algorithm  via  an  octree.  The  ray  A shown  here  visits  five  of 
the  voxels  to  examine  the  objects  in  those  five  voxels.  Three 
of the eight objects need to be tested for intersection. The
ray B visits only one of the voxels and performs one ray-object
intersection test, where the original ray tracing algorithm would
perform eight. Finer subdivision can decrease
the  number  of  ray-object  intersection  tests  at  the  expense  of 
additional  voxel  processing  overhead.  This  algorithm  requires 
enormous  amounts  of  data  storage  [Wat  89]. 

3.5  Uniform  Spatial  Subdivision  Algorithm 

Fujimoto [Fuj 86] introduced a uniform spatial subdivision
algorithm in which voxels of uniform size are organized in a
regular three-dimensional grid. The overall strategy is quite
similar to the nonuniform spatial subdivision algorithm. The
voxels are processed in the order they are pierced. When each
voxel is tested, candidate objects in the voxel are intersected
with the ray. To step from voxel to voxel, Fujimoto developed a
three-dimensional digital differential analyzer (3DDDA) that
incrementally computes successive voxel indices in the same way
that efficient line rasterization algorithms incrementally
compute pixel coordinates [Fol 90]. This 3DDDA eliminates
floating point multiplications and divisions. Figure 3.8 shows
this algorithm.

Figure 3.7 Non-uniform Spatial Subdivision
(TO: Tested Object; PV: Processed Voxels)
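
The idea of stepping incrementally from voxel to voxel can be sketched as follows. This is not Fujimoto's integer 3DDDA: it is a simplified two-dimensional floating-point grid walk in the style of Amanatides and Woo, whose setup uses divisions, and it assumes the ray origin lies inside the grid. The function name `grid_walk` and its parameters are illustrative.

```python
import math


def grid_walk(origin, direction, grid_size, cell_size=1.0):
    """Yield (i, j) indices of the cells pierced by a 2-D ray, in order.

    The grid covers [0, grid_size * cell_size) on both axes and the
    origin is assumed to lie inside it.
    """
    idx = [int(math.floor(origin[a] / cell_size)) for a in range(2)]
    step = [0, 0]
    t_max = [math.inf, math.inf]    # ray parameter at the next cell border
    t_delta = [math.inf, math.inf]  # ray parameter needed to cross one cell
    for a in range(2):
        d = direction[a]
        if d > 0:
            step[a] = 1
            t_max[a] = ((idx[a] + 1) * cell_size - origin[a]) / d
            t_delta[a] = cell_size / d
        elif d < 0:
            step[a] = -1
            t_max[a] = (idx[a] * cell_size - origin[a]) / d
            t_delta[a] = -cell_size / d
    while 0 <= idx[0] < grid_size and 0 <= idx[1] < grid_size:
        yield (idx[0], idx[1])
        if t_max[0] == math.inf and t_max[1] == math.inf:
            return  # zero direction vector: the ray never leaves the cell
        # Advance along the axis whose cell border is reached first.
        a = 0 if t_max[0] <= t_max[1] else 1
        idx[a] += step[a]
        t_max[a] += t_delta[a]
```

After the setup, each step costs one comparison and two additions, which is the property that makes incremental traversal attractive.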

The uniform spatial subdivision algorithm differs from the
nonuniform spatial subdivision algorithm in the following ways:

1) the subdivision strategy does not depend on the structure
of the environment;

2) access to the ray-pierced voxels is very fast due to the
incremental calculations.

Figure 3.9 shows a 2-D analog of uniform spatial subdivision.
The ray A visits 14 voxels and results in one object being
tested for ray intersection. But there are many empty voxels in
this example. Since this algorithm does not depend on the
structure of the environment, there are many voxels which point
to nothing. This disadvantage alone does not outweigh the
advantage of fast access. The major disadvantage is that a huge
memory may be required. Although paged memory techniques can be
used to implement the scheme, there is a large memory management
overhead in paging, and many modest images cannot be handled
expeditiously.

3.6  Inside  Test 

Whatever kind of fast algorithm is used in ray tracing,
it is vitally important that the ray-object intersections
be correct. The main advantage of the spatial subdivision algorithm

procedure Ray_trace (start, direction, depth, color)
vector  : start, direction
integer : depth
colors  : color
begin
    vector : intersection_point, reflected_direction,
             transmitted_direction
    colors : local_color, reflected_color, transmitted_color

    color <- black;
    if (depth <= maxdepth)
    begin
        color <- back_ground_color;
        Grid_intersect (start, direction, node);
        if (intersection)
        begin
            local_color <- [ contribution of local color model at
                             intersection point ]
            { Calculate direction of reflected ray }
            Ray_trace (intersection_point, reflected_direction,
                       depth+1, reflected_color)
            { Calculate direction of transmitted ray }
            Ray_trace (intersection_point, transmitted_direction,
                       depth+1, transmitted_color)
            Combine (color, local_color, local_weight_for_surface,
                     reflected_color, reflected_weight_for_surface,
                     transmitted_color, transmitted_weight_for_surface)
        end
    end
end

procedure Grid_intersect (origin, direction, node)
vector        : origin, direction
globalpointer : *node
begin
    [ compute i,j,k for the voxel containing origin ];
    [ set up 3DDDA based on direction and origin ]
    repeat
        for each object in voxel[i,j,k] do
            Intersect (origin, direction, object);
        endfor
        if (no intersection) then
            update i,j,k using 3DDDA;
        endif
    until an intersection is found or outside of environment
end


Figure 3.8 Uniform Spatial Subdivision Algorithm

Figure 3.9 Uniform Spatial Subdivision (Circle: Object)

is that sorting of the ray-pierced voxels is built into the
algorithm. The bounding volume algorithm, in contrast, does not
include a sorting procedure. Even if we sort the bounding
volumes, the sorted order of the bounding volumes is not
necessarily the sorted order of the objects. For example, ray A
pierces two bounding volumes, BV1 and BV2, in Figure 3.10 (a).
The sorted order of the bounding volumes with respect to ray A is
(BV1, BV2), but the sorted order of the objects is different.
There is a ray-object intersection in BV1, but that intersection
is not the correct (nearest) one for ray A. The reason is that
bounding is performed on objects, not on space. This is a problem
of the bounding volume algorithm (including hierarchical
bounding volumes).

The spatial subdivision algorithm has a different kind of
problem. The ray B in Figure 3.10 (b) pierces the voxels in
the following order: voxel 4, voxel 3, voxel 7, voxel 6, voxel
5. Since object OBJ3 is in voxel 4, the ray-object
intersection test is performed between the ray and OBJ3. But
the intersection point is not in voxel 4.

Figure 3.10 Inside Test

The inside test makes sure that an intersection point is
in a particular voxel in the spatial subdivision algorithm.
The inside test procedure performs the following steps after a
ray-object intersection has been detected:

1. find the intersection point using the intersection
parameter value (t) and the ray equation;

2. check whether the intersection point is in the x, y, z
interval of the voxel;

3. accept the intersection only if the intersection point is in
the voxel.
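
The three steps above can be sketched in a few lines. The function names and the small tolerance `eps` are assumptions made for illustration; the voxel is taken to be axis-aligned and given by its minimum and maximum corners.

```python
def point_on_ray(origin, direction, t):
    """Step 1: evaluate the ray equation P(t) = origin + t * direction."""
    return tuple(o + t * d for o, d in zip(origin, direction))


def inside_voxel(point, vmin, vmax, eps=1e-9):
    """Step 2: check the point against the x, y, z intervals of the voxel."""
    return all(lo - eps <= p <= hi + eps
               for p, lo, hi in zip(point, vmin, vmax))


def accept_hit(origin, direction, t, vmin, vmax):
    """Step 3: accept the hit only if its point lies inside the voxel."""
    return inside_voxel(point_on_ray(origin, direction, t), vmin, vmax)
```

For a ray starting at the voxel corner and travelling along x, a hit at t = 0.5 is accepted for the unit voxel while a hit at t = 1.5 is rejected and left for a later voxel.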

To  get  a fast  ray  tracing  algorithm,  we  must  find  a fast 
and  easily  computed  inside  test  algorithm.  To  save  memory  in 
the  spatial  subdivision  algorithm,  we  can  combine  the  bounding 
volume  algorithm  and  the  spatial  subdivision  algorithm.  In 
this  case  bounding  is  performed  on  sub-spaces  rather  than 
objects  and  the  inside  test  must  be  done  on  every  intersection 
point.  This  new  algorithm  will  be  developed  in  chapter  4. 


CHAPTER  4 
DEPTH  SORTER 


4.1  Introduction 

The  Depth  Sorter  discussed  in  this  chapter  is  a procedure 
for  substantially  speeding  up  the  ray  tracing  calculations. 
The  core  of  this  algorithm  consists  of  two  ideas: 

1)  sort  bounding  volumes, 

2)  avoid  unnecessary  intersection  tests  even  for  objects  in 
the  bounding  volumes  intersected  with  the  ray. 

This algorithm is modeled after other fast algorithms and
avoids the drawbacks inherent in them. This chapter shows
the development of the fast algorithm and discusses some of its
problems. The data structure is also an important factor to
consider, both to save memory space and to handle scenes that
contain a great many objects. A critical factor in the
computational efficiency of ray tracing is the ease with which
object bounding can be accomplished. After objects are
assigned to bounding volumes, the sorting of those volumes
with respect to the ray is a key problem.

The main sorting idea and aspects of its hardware
implementation are considered. The following subsections
discuss these topics step by step.

4.2 General Depth Comparator

The ray tracing algorithm finds the object nearest to the
ray origin. The algorithm performs the intersection calculation
first. After a ray meets an object, the depth of the
intersection is compared with the previously found depth.
Repeating this step for every object, the comparator finds
the nearest object at each pixel. We can therefore interpret the
ray tracing algorithm as a depth comparator for objects in image
space. If we want to find the nearest object without using a
bounding or spatial subdivision algorithm, the ray tracing
algorithm itself is sufficient for calculating the depth of
each visible element in the image. The amount of calculation,
and hence the execution time, is basically fixed by the
complexity of the scene. By adding hardware, we can reduce
the run time. According to statistics reported by Whitted [Whi
80], as much as 95% of ray tracing time is spent on intersection
calculations. To relieve this bottleneck stage in ray
tracing, we could use hardware that performs the intersection
test. Let's consider some of the problems in implementing this
kind of hardware.
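
Viewed as a depth comparator, the basic algorithm reduces to a running minimum over all objects. The sketch below assumes a caller-supplied `intersect` function that returns a ray parameter or `None`; `hit_plane_z` is a toy primitive (a plane of constant z) introduced only to make the example runnable.

```python
import math


def nearest_hit(origin, direction, objects, intersect):
    """Test every object and keep the smallest positive hit distance."""
    best_t, best_obj = math.inf, None
    for obj in objects:
        t = intersect(origin, direction, obj)
        # Depth comparison against the previously found depth.
        if t is not None and 0.0 < t < best_t:
            best_t, best_obj = t, obj
    return best_obj, best_t


def hit_plane_z(origin, direction, z):
    """Toy primitive: ray parameter at which the ray reaches the plane z."""
    if direction[2] == 0.0:
        return None
    return (z - origin[2]) / direction[2]


planes = [5.0, 2.0, 9.0]
nearest = nearest_hit((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), planes, hit_plane_z)
```

Every object is intersected for every ray, so the work grows with scene complexity regardless of how many objects the ray actually comes near.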

The typical object space for ray tracing is made up of
two types of surfaces: flat surface elements and quadratic
surfaces. For both of these types the surface equations and
the surface normal equations are relatively simple [Han 89].

It may be possible to devise schemes for depth sorting
using a wide range of object primitives, but this is not easy,
as the example in Figure 4.1 shows. Let us consider a general
polygonal surface element. Since the element may be concave,
the star shown in Figure 4.1 represents the general case.

Note that the pertinent equations are:

$$f_i(x, y, z) = a_i x + b_i y + c_i z = 0 \qquad (i = 1, 2, \ldots, 5) \tag{4.0}$$

To  find  the  ray/object  intersection  we  must 

1.  Calculate  the  intersection  of  the  ray  with  the  surface,  as 
a function  of  the  ray  parameter  t; 

2.  Find  the  intersection  point  using  t and  the  ray  equation; 

3.  Check  the  following  conditions 

( fi  ^ 0,  f2  ^ 0,  fj  > 0)  OR 

( f2  ^ 0,  fj  ^ 0,  f4  > 0)  OR 

( f2  ^ 0,  f4  ^ 0,  fs  ^ 0)  OR 

( fi  < 0,  f4  > 0,  fj  > 0)  OR 

( fi  ^ 0,  fj  < 0,  fj  < 0)  OR 

( fi  ^ 0,  f2  ^ 0,  fj  < 0,  f4  ^ 0,  fj  ^ 0) 

The tests enumerated in Step 3 can become difficult to implement.
Generally, the more complex the polygonal shape, the more
involved the test for intersection becomes. The major problem
with devising a depth comparator for arbitrary object
primitives is that such objects do not have a regular
(uniform) shape, and whenever we employ a new primitive, the
comparator must be modified to accommodate the comparisons
required by the new primitive. So primitive sorting is not
easy and not necessarily a good idea. If all the objects are of
the same type (i.e. bounding volumes), we can sort the depths of
the objects with respect to the ray direction. If the bounding
volume is relatively simple, ray intersections with the actual
object points may not be difficult to find.

Figure 4.1 A general polygonal surface element

From the above example we know that primitive depth
sorting is not easy; however, bounding volume sorting could be
easy. When we employ the bounding volume algorithm to sort
depth with respect to the ray, we must consider the following
questions:

1.  What  kind  of  bounding  shape  gives  the  simplest  intersection 
test? 

2.  What  kind  of  bounding  volumes  can  be  fast  and  easily  sorted 
with  simple  calculations? 

There are many bounding volumes which have a regular
shape. The box and the sphere are representative. Other volumes
could be more complicated than these. The intersection test for
a sphere bounding volume is easier than that for a box bounding
volume. A simple object such as the one shown in Figure 4.2 (a)
can be bounded using a sphere as shown in Figure 4.2 (b). The
void area of a bounding volume is defined as the difference in
area between the orthogonal projections of the object and the
bounding volume onto a plane perpendicular to the ray and
passing through the origin of the ray [Weg 84]. Sphere bounding
results in a very simple intersection test. However, sphere
bounding may not be proper for many shapes.

Figure 4.2 Bounding Volume

Kay and Kajiya presented a method of handling box bounding
based on slabs [Kay 86]. A slab is simply the space between two
parallel planes. The intersection of a set of slabs defines the
bounding volume. This bounding volume method does not overcome
the void area problem for this kind of object, as illustrated in
Figure 4.2 (a). This algorithm also needs to build a hierarchy
of the objects and bounding volumes in image space. Drawbacks of
hierarchical bounding volumes are addressed by techniques for
generating bounding volume hierarchies automatically [Nic 94].
Hardware implementation for box bounding volume intersection
results in machinery that is not simpler than that for sphere
bounding volumes.
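
To illustrate why the sphere test is considered simple, the following sketch decides whether a ray hits a bounding sphere using only multiplications, additions and comparisons. It is a standard hit/miss test, not the comparison scheme developed later in this chapter, and the function name and argument layout are assumptions.

```python
def ray_hits_sphere(origin, direction, center, radius):
    """Return True if the ray (t >= 0) touches the sphere.

    Only the sign of the discriminant is needed, so no square root
    and no division are performed.
    """
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    if c <= 0.0:
        return True    # ray origin is inside or on the sphere
    if b >= 0.0:
        return False   # sphere lies behind the ray origin
    return b * b - 4.0 * a * c >= 0.0
```

The square root becomes necessary only when the actual hit distance is wanted, for example to order volumes by depth.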

Even if we set aside the void area problem, we need to look
at the volume link problem, as can be seen from Figure 4.3. One
advantage of the ray tracing algorithm is that we can freely
move the viewpoint. Objects are bounded by bounding volumes and
bounding volumes are linked by pointers. For three bounding
volumes, there are six possible link orders, as shown in
Figure 4.3. For the linked list (a), the bounding volume
algorithm performs intersection tests first with bounding
volume 3 and then with the objects in volume 3. Even though the
ray hits volume 3, it does not hit any object inside volume 3.
The ray tracing algorithm repeats the same procedure for
volume 2, but fails to find any intersection. Finally it tests
volume 1 and finds the nearest object from the viewpoint. In
this worst case, the bounding volume ray tracing algorithm
performs intersection tests with all objects and bounding
volumes.

Figure 4.3 Bounding Volume Link Lists

The best case is in Figure 4.3 (f). There the algorithm first
performs the intersection test with volume 1 and finds the
nearest object. The algorithm keeps the nearest hit distance
until all intersections are performed. When the next
intersection of the ray with a volume is calculated, the
algorithm only compares against the nearest distance and either
updates it or bypasses those steps. The problem in this
situation is how to keep the best linked list for any viewpoint
and for all rays. The spatial subdivision algorithm gives a
solution for this situation by sorting the spaces hit by the ray
with respect to the ray direction. But to solve the void area
problem, the spatial subdivision algorithm must divide the space
more finely, which in turn requires bigger arrays. The memory
representing image space will be proportional to n^3; e.g. if
the latitude, longitude, and height axes are each divided into
10 intervals, then a 10x10x10 array is required to represent
the image space. Yet another problem is the inside test, which
was mentioned in the previous chapter.

So far we have found the following problems with the bounding
volume algorithm and the spatial subdivision algorithm:

1. Void areas in bounding volumes;

2. Link status in bounding volumes;

3. Memory space representing image space in the spatial
subdivision algorithm;

4. Inside test in the spatial subdivision algorithm.

By bounding subspaces instead of objects we can overcome
the problems listed above. A new bounding algorithm based on
the above considerations is proposed in Figure 4.4.

Image space is first divided into subspaces. A subspace
is a bounding volume if it contains an entire object or at
least a part of an object. So in this bounding volume algorithm
an object can be bounded by several bounding volumes, and one
bounding volume can contain several objects. This bounding
strategy eases the void area problem. The next step is to sort
those bounding volumes with respect to the ray. The algorithm
inspects the objects in each bounding volume in sorted order.
Once the ray hits the nearest object, no more calculations are
required for the intersection test. For example, in Figure 4.4
(b), spaces 1, 2, 3, 4, 5, 7, 9, 12, 14 cannot be bounding
volumes because they contain neither a part of an object nor an
entire object. Six bounding volumes will be sorted with respect
to the ray direction as shown in (c). The algorithm will
inspect volume 11 to find the nearest object A. After finding
the nearest object in the bounding volume nearest to the
viewpoint, the algorithm quits inspection.
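
The control flow just described (sort the bounding volumes along the ray, inspect them nearest first, stop at the first accepted hit) can be sketched as follows. The `Volume` record, with precomputed entry and exit parameters and precomputed object hit distances, is a simplification assumed for illustration; it stands in for the sorting and intersection machinery developed in the rest of this chapter.

```python
from typing import Dict, NamedTuple, Optional


class Volume(NamedTuple):
    vid: int                           # bounding volume number
    t_enter: float                     # ray parameter where the ray enters
    t_exit: float                      # ray parameter where the ray leaves
    hits: Dict[str, Optional[float]]   # object name -> hit parameter or None


def first_hit(volumes):
    """Return (volume id, object, t, tests performed) or None."""
    tested = 0
    for vol in sorted(volumes, key=lambda v: v.t_enter):
        best = None
        for name, t in vol.hits.items():
            tested += 1
            # Inside test: the hit must lie within this volume's interval.
            if t is not None and vol.t_enter <= t <= vol.t_exit:
                if best is None or t < best[1]:
                    best = (name, t)
        if best is not None:
            return vol.vid, best[0], best[1], tested  # quit inspection
    return None


scene = [
    Volume(8, 3.0, 4.0, {"C": None}),
    Volume(10, 2.0, 3.0, {"A": 1.5, "B": 2.5}),
    Volume(11, 1.0, 2.0, {"A": 1.5}),
]
```

With this toy scene only one object test is performed, because the nearest volume already yields an accepted hit on object A.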
Figure 4.4 New Algorithm and Data Structure

4.3 Possible Problems with the New Bounding Volumes

The new bounding volume algorithm addresses two problems
in ray tracing. The first one is to make bounding volumes
compact, avoiding the void area problem, by allowing objects to
overlap in the bounding volumes. So any shape could be a
possible candidate for bounding volumes. We choose shapes that
give simple intersection tests and are easily sorted with
respect to the ray. We also need to consider possible hardware
construction implications. Let's consider two bounding
volumes, boxes and spheres.

4.3.1  Box  Bounding  Volumes 

A box bounding volume is defined by six planes, each
described by a plane equation. To find the intersection
parameter values with a ray, the algorithm performs at least 6
division operations for each box.
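
The division count can be made concrete with a direct slab-style ray-box test. The sketch assumes an axis-aligned box given by two corners and counts one division per bounding plane; the function name and the returned counter are illustrative only.

```python
import math


def ray_box(origin, direction, box_min, box_max):
    """Return (t_near, t_far, divisions) for an axis-aligned box, or None."""
    t_near, t_far, divisions = -math.inf, math.inf, 0
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray parallel to this pair of planes: must start between them.
            if o < lo or o > hi:
                return None
            continue
        t0 = (lo - o) / d   # one division per plane
        t1 = (hi - o) / d
        divisions += 2
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far or t_far < 0.0:
            return None
    return t_near, t_far, divisions
```

For a ray that is not parallel to any face, all six planes contribute a division, which matches the count given in the text.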

Division takes three or four times as long to compute as
multiplication in most implementations. Furthermore, the
division algorithm tends to be difficult to pipeline due to the
dependencies inherent in selecting quotient bits [Flo 89]
[Ere 94]. To avoid division operations in the sort procedures,
we compare the depths of box bounding volumes using coefficients
of the plane equations. In this case the inside test is not
easy. For example, Figure 4.5 shows two-dimensional box bounding
volumes containing simple objects A and B. Ray R1 starts from
outside of the two bounding boxes 1 and 2. Some suitable
algorithm sorts the two bounding boxes with respect to the ray
direction R1. When we calculate the intersection of ray R1 and
the objects in Box 1, ray R1 hits object A at a point in Box 2.
In this case ray R1 should return the color information of
object B, not A. To prevent this situation, we must perform an
inside test, as we remarked in the previous chapter.

Figure 4.5 Inside test and overlap test

When the ray hits object A, we know only the intersection
distance, basically a parameter value (t). From this value t
we can find the intersection location using the ray equation.
After finding the intersection location, we need to check
whether that location is in the bounding volume or not. These
procedures need many operations (multiplications, additions,
comparisons). This inside test is not easy for box bounding.
The hardware also becomes more complicated than the box sorter
alone when the inside test is added to it.

4.3.2 Sphere Bounding Volumes

Sphere bounding volumes share some of the problems of the
procedure that uses box bounding volumes. One advantage of the
sphere is that the volume is easier to specify, i.e. only one
equation is enough to represent a sphere. Because the bounding
volume equation is quadratic, we need a square root function to
solve for the intersection distance.
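
For reference, the quadratic in question can be written out as follows. The symbols O (ray origin), D (ray direction), C (sphere center) and r (sphere radius) are notation assumed here for illustration; the text above does not fix them.

```latex
\begin{align*}
P(t) &= O + tD, \qquad \lVert P(t) - C \rVert^{2} = r^{2} \\
a t^{2} + b t + c &= 0, \qquad
  a = D \cdot D, \quad
  b = 2\, D \cdot (O - C), \quad
  c = \lVert O - C \rVert^{2} - r^{2} \\
t &= \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
\end{align*}
```

The square root and the division appear only in the last line, which is why a comparison based on the coefficients a, b and c alone is attractive.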




While the design of fast and efficient adders and
multipliers is well understood, division and square root remain
serious design challenges. The reasons are the intrinsic
dependence among the iteration steps and the complexity of the
result-digit generation function [Ere 94]. So sorting depth
with respect to ray intersection distance may not be a good
idea. However, comparing the coefficients of the bounding
volume equation and the ray direction provides a clue for
sorting the depths of the bounding volumes. This approach will
be given in section 4.5, where we discuss the object
intersection with the ray. Using coefficients from the bounding
volume equations and the object intersection depth parameter t,
we can easily perform the inside test. The other problem is to
avoid repeating the intersection test with an object already
found not to intersect the ray. Figure 4.5 (b) shows two sphere
bounding volumes, one containing object A and the other
containing objects A and B. Ray R2 passes through volumes 1
and 2. A sort algorithm sorts the two volumes. At the object
intersection stage, the intersection algorithm tests whether R2
hits object A in SP1 or not. In this example R2 misses object A.
For the next bounding volume, another intersection test will be
carried out. We want to avoid the object intersection test for A
there, because we already know that object A was missed by R2.
This problem will also be considered in section 4.6.
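
One common way to avoid the repeated test, shown here only as an illustration and not as the method of section 4.6, is to remember each object's result for the current ray (a technique often called mailboxing). The volume tuples, the `intersect` callback and the call counter are assumptions of this sketch.

```python
def first_hit_with_cache(sorted_volumes, intersect):
    """sorted_volumes: list of (t_enter, t_exit, [object names]), nearest first.

    intersect(name) returns the ray parameter of the hit, or None for a miss.
    Returns ((name, t) or None, number of intersect calls).
    """
    cache = {}   # object name -> result already computed for this ray
    calls = 0
    for t_enter, t_exit, names in sorted_volumes:
        best = None
        for name in names:
            if name not in cache:          # test each object at most once
                cache[name] = intersect(name)
                calls += 1
            t = cache[name]
            if t is not None and t_enter <= t <= t_exit:
                if best is None or t < best[1]:
                    best = (name, t)
        if best is not None:
            return best, calls
    return None, calls


# Toy version of the example above: R2 misses A and hits B.
hits = {"A": None, "B": 3.0}
volumes = [(1.0, 2.5, ["A"]), (2.0, 4.0, ["A", "B"])]
result, calls = first_hit_with_cache(volumes, hits.get)
```

Object A appears in both volumes but is intersected only once, so two tests are performed instead of three.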
4.4 Bounding of Objects

The  new  bounding  algorithm  consists  of  two  steps: 

1.  The  object  bounding  stage; 

2.  The  ray  calculation  stage. 

In the first stage the objects in the image space are
partitioned by invisible spherical surfaces. After the first
stage, the algorithm computes the ray at each pixel. The main
objective of the object bounding stage is to reduce the void
volume in each sphere bounding volume. Generally, if we employ
many bounding volumes, we can make the void volume very small.
But if the time spent sorting and examining the bounding
volumes is larger than the time spent calculating the object
intersections, there is no advantage. Even though we may
implement hardware for fast sorting of bounding volumes, this
hardware will require a large memory to hold the data for the
many bounding volumes.

In the new bounding algorithm, sorting is done to find the
nearest bounding volume for each ray. The sorting ultimately
should be done in hardware and the intersection tests between
objects and ray will be done by software. The complexity of the
hardware will be dictated by the number of spherical bounding
volumes it is to handle. Let's assume that the depth sorter
(hardware implemented for sorting sphere bounding volumes) can
handle N sphere bounding volumes.

There are two ways to assign bounding spheres:

1.  A priori  assignment,  typically  by  the  user's 
examination  of  the  structure  of  the  object  assembly; 

2.  Automatic  bounding,  using  an  explicit  algorithm  to 
cluster  the  objects  in  the  scene. 

We examine the algorithm for the clustering process. Usually
we know the location of objects in the scene and the nature of
the objects. We partition image space into N boxes. By
shrinking each rectangular box along each axis, we make each box
as compact as possible. When we bound the objects, the best
rule to minimize void areas is to make the bounding box shape
cubical. After making boxes of desirable shape, we tightly wrap
each box in a circumscribing sphere. This wrapping procedure is
the same as that of the automatic bounding, which will be
discussed later. After assigning sphere bounding volumes, the
algorithm inspects each object to determine whether it is
already included in a sphere bounding volume or not. For a
simple image space, bounding by visual inspection is very easy.
When the image space has many objects and the space partition
is not simple, the object bounding step may take many
calculations.

The  automatic  bounding  procedure  is  simpler  to  employ 
than  bounding  made  by  inspection.  Figure  4.6  shows  the 
automatic  bounding  procedures  for  two-dimensional  object 
space.  (We  use  two-dimensional  space  for  illustration  only; 
the  algorithm  is  designed  for  three-dimensional  application.) 


Figure 4.6 Bounding procedure of New Algorithm

This space has 4 objects A, B, C, D. Let us assume N = 6, i.e.
this algorithm can employ up to 6 sphere bounding volumes (in
3-space).

To bound objects, we must know the extent of the object
space. By checking each object, we find the maximum extent of
the objects. Figure 4.6(a) shows this extent. The aspect
ratio of this frame is approximately 3:2, so the space is
divided into 6 boxes. Each box in the example has large void
areas. To reduce those void areas, we shrink each box along
the principal view coordinate axes so that the shrunken boxes
just enclose the objects or parts of objects. We may also find
that some boxes are empty, i.e. contain no objects from the
scene. These boxes are removed from the list of potential
bounding boxes. In this example, box 6 in Figure 4.6(b) is
removed from the candidate list for bounding volumes. We can
easily wrap a sphere around each rectangular box. For the
two-dimensional case used as an example, the center of the
sphere (here a circle) is the intersection point of the two
diagonals and the radius is half the length of the box
diagonal. Employing these circles as the bounding volumes, we
can bound the objects in the image space. In the
three-dimensional case, the three-dimensional bounding spheres
appear as circles in the viewing space.
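
The automatic bounding steps (partition, shrink, discard empty
boxes, circumscribe) can be sketched in a few lines. The sketch
below is only an illustration under simplifying assumptions that
are not part of the procedure above: each object is represented
by a set of sample points (for instance its vertices), the grid
resolution nx, ny, nz is supplied by the caller, and the function
name bound_scene is hypothetical. An object that crosses a box
without placing a sample point in it would be missed, so a full
implementation would clip the object geometry against each box.

```python
import math

def bound_scene(points_by_object, nx, ny, nz):
    """Partition the scene extent into nx*ny*nz boxes, shrink each box to
    the points it holds, drop empty boxes, and wrap each box in a sphere."""
    pts = [p for obj in points_by_object.values() for p in obj]
    lo = [min(p[k] for p in pts) for k in range(3)]
    hi = [max(p[k] for p in pts) for k in range(3)]
    counts = (nx, ny, nz)

    def cell(p):
        # Grid index of the box that contains point p.
        idx = []
        for k in range(3):
            span = hi[k] - lo[k]
            i = int((p[k] - lo[k]) / span * counts[k]) if span > 0 else 0
            idx.append(min(i, counts[k] - 1))
        return tuple(idx)

    boxes = {}  # only non-empty boxes are ever created
    for name, obj in points_by_object.items():
        for p in obj:
            box = boxes.setdefault(cell(p), {"pts": [], "objects": set()})
            box["pts"].append(p)
            box["objects"].add(name)

    spheres = []
    for key in sorted(boxes):
        box = boxes[key]
        # Shrink the box to the extent of its contents.
        bmin = [min(p[k] for p in box["pts"]) for k in range(3)]
        bmax = [max(p[k] for p in box["pts"]) for k in range(3)]
        # Circumscribing sphere: center at the box center,
        # radius equal to half the box diagonal.
        center = tuple((a + b) / 2 for a, b in zip(bmin, bmax))
        radius = math.dist(bmin, bmax) / 2
        spheres.append({"center": center, "radius": radius,
                        "objects": sorted(box["objects"])})
    return spheres
```

Because empty boxes never enter the dictionary, the number of
spheres returned is at most nx*ny*nz, which corresponds to the
limit N of the depth sorter.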

Bounding by visual inspection can employ all N bounding
volumes, because the partition is tailored by the human
operator. However, partitioning the space is not easy, there
are many alternative ways of assigning the bounding spheres,
and in general it takes a long time to bound the objects of an
arbitrary scene. The automatic process of assigning bounding
volumes takes only a short time to bound the objects in the
image space. It can employ up to N bounding volumes. The
automatic bounding algorithm shown in the example of Figure
4.6 uses 5 of the 6 possible bounding volumes, so one bounding
volume is idle during the subsequent sorting process. One
could develop a tighter algorithm, at the expense of
additional execution time, which leaves fewer bounding volumes
idle and packs the bounding volumes more densely to reduce the
void area problem. However, the additional running time was
found to reduce the overall efficiency. Hence the procedure
shown here is a "best compromise" solution leading to the
greatest observed improvements.


4.5  Algorithm 

4.5.1  Filtering  and  Comparison 

Many sphere bounding volumes are constructed by the procedure
presented in the previous section. In the ray calculation
stage, we need to find the spherical bounding volumes which
are intersected by the ray and sort them by the distance,
along the ray direction, to the point where the ray intersects
each bounding volume. After discarding the unintersected
sphere bounding volumes, we need a criterion for comparing
depths, two spheres at a time. This criterion can be derived
from a comparison between the coefficient pairs of the
quadratic equations that describe the ray-sphere
intersections.

Let d be the ray direction unit vector from the viewpoint,

$$\mathbf{d} = (d_x, d_y, d_z), \qquad d_x^2 + d_y^2 + d_z^2 = 1$$

Furthermore,

$V_0$ is the viewpoint $(V_x, V_y, V_z)$;
$B_0$ is the center of the bounding sphere $(x_0, y_0, z_0)$; and
$R$ is the radius of the bounding sphere.

The bounding volume equation is

$$(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = R^2 \tag{4.1}$$

The ray equation is

$$x = V_x + t\,d_x, \qquad y = V_y + t\,d_y, \qquad z = V_z + t\,d_z \tag{4.2}$$

Substituting Eq. (4.2) into Eq. (4.1), we get

$$(V_x + t\,d_x - x_0)^2 + (V_y + t\,d_y - y_0)^2 + (V_z + t\,d_z - z_0)^2 = R^2 \tag{4.3}$$

Reorder Eq. (4.3) with respect to t:

$$t^2 (d_x^2 + d_y^2 + d_z^2) + t\,[\,2 d_x (V_x - x_0) + 2 d_y (V_y - y_0) + 2 d_z (V_z - z_0)\,] + (V_x - x_0)^2 + (V_y - y_0)^2 + (V_z - z_0)^2 - R^2 = 0 \tag{4.4}$$

Let

$$b = 2 d_x (V_x - x_0) + 2 d_y (V_y - y_0) + 2 d_z (V_z - z_0) \tag{4.5}$$

$$c = (V_x - x_0)^2 + (V_y - y_0)^2 + (V_z - z_0)^2 - R^2 \tag{4.6}$$

Then, since d is a unit vector, Eq. (4.4) becomes

$$t^2 + b\,t + c = 0 \tag{4.7}$$
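
As a small illustration of Eqs. (4.5) to (4.7), the coefficient
pair for one ray and one bounding sphere can be computed as
follows. The function name ray_sphere_coefficients and the tuple
representation of vectors are assumptions made for this sketch.
The direction is normalized here so that the leading coefficient
of the quadratic is 1; this costs one square root per ray, not
one per bounding volume.

```python
import math

def ray_sphere_coefficients(origin, direction, center, radius):
    """Return (b, c) of t^2 + b*t + c = 0 for a ray and a bounding sphere."""
    # Normalize the direction so that dx^2 + dy^2 + dz^2 = 1.
    norm = math.sqrt(sum(d * d for d in direction))
    d = [x / norm for x in direction]
    # Vector from the sphere center B0 to the ray origin V0.
    oc = [o - s for o, s in zip(origin, center)]
    b = 2.0 * sum(di * oi for di, oi in zip(d, oc))      # Eq. (4.5)
    c = sum(oi * oi for oi in oc) - radius * radius      # Eq. (4.6)
    return b, c
```

For a ray from the origin along +z and a unit sphere centered at
(0, 0, 5), this gives b = -10 and c = 24, so the quadratic
t^2 - 10t + 24 = 0 has roots t = 4 and t = 6, the entry and exit
distances.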


The intersection of the parametric ray with the sphere is
characterized by the quadratic equation (4.7). Each bounding
volume will have a different set of coefficients {b, c} for
every ray. Let $t^2 + b_1 t + c_1 = 0$ and
$t^2 + b_2 t + c_2 = 0$ be the two parametric intersection
equations for sphere bounding volumes 1 and 2 respectively.
Using the coefficients b, c and only simple arithmetic, we can
compare the depths of the sphere bounding volumes with respect
to the ray direction. Before developing the algorithm, we
define the discriminant D and four state variables
$X_1, X_2, X_3, X_4$:

$$X_1 = \begin{cases} 1 & b_2 < b_1 \\ 0 & b_2 \ge b_1 \end{cases}$$

$$X_2 = \begin{cases} 1 & 4(c_1 - c_2) \le b_1^2 - b_2^2 \\ 0 & 4(c_1 - c_2) > b_1^2 - b_2^2 \end{cases}$$

$$X_3 = \begin{cases} 1 & b_1 b_2 < 2(c_1 + c_2) \\ 0 & b_1 b_2 \ge 2(c_1 + c_2) \end{cases}$$

$$X_4 = \begin{cases} 1 & (b_2 c_1 - b_1 c_2)(b_1 - b_2) < (c_1 - c_2)^2 \\ 0 & (b_2 c_1 - b_1 c_2)(b_1 - b_2) \ge (c_1 - c_2)^2 \end{cases}$$

$$D = b^2 - 4c \tag{4.8}$$

Let us consider the following two quadratic equations:

$$f_1(t) = t^2 + b_1 t + c_1 \tag{4.9}$$

$$f_2(t) = t^2 + b_2 t + c_2 \tag{4.10}$$

Because  a non-positive  intersection  value  is  always 
meaningless  in  ray  tracing,  we  are  interested  in  only  the 
smallest  positive  root  of  each  equation. 

Three spherical bounding volumes (sp1, sp2, and sp3) are
shown in Figure 4.7(b). The ray R starts at the origin P and
travels in a given direction. Each bounding volume contains
objects. The ray R has no intersection with sp2 or sp3, so no
object in sp2 or sp3 will be intersected by the ray R.
Physically, the sphere bounding volume sp1 does not meet the
ray, but mathematically that volume has an intersection with
the ray. The curve $f_1(t)$ shows the relationship between the
physical interpretation and the mathematical interpretation.
The intersection of sp1 with the ray R occurs for negative
values of the parameter t. Two negative real roots of $f_1(t)$
mean that the sphere bounding volume is located behind the ray
origin. Since we are only interested in the objects which lie
in the direction of the ray, we need to remove those bounding
volumes at the pre-sort stage of the ray tracing algorithm.

Figure 4.7(a) gives a clue to the identity of the
unintersected bounding volumes. A negative discriminant D in
equation (4.8) means that the volume has no intersection with
the ray. Even when real intersection values exist, two
negative values of the parameter t are useless as intersection
points in the positive direction of the ray, as can be
ascertained from $f_1(t)$ in (a).

Figure 4.7 Unintersected bounding volumes

The vertex of $f_1(t)$ is in the left half plane and its
intersection with the vertical axis is positive. So when the
coefficient pair satisfies $b_i \ge 0$ and $c_i > 0$, we know
that the sphere bounding volume is behind the origin of the
ray. We filter out the following two cases in a pre-sort stage
with simple tests of the quadratic function coefficients:

1.  f(t)  has  no  real  root; 

2.  f(t)  has  two  negative  roots. 
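
The two filter conditions translate directly into coefficient
tests. The following is a minimal sketch; the function name
passes_presort is hypothetical, and b and c are assumed to be the
coefficients of Eq. (4.7) for a unit direction vector.

```python
def passes_presort(b, c):
    """Return True if the bounding volume should be kept for depth sorting."""
    # Case 1: negative discriminant, the ray misses the sphere entirely.
    if b * b - 4.0 * c < 0.0:
        return False
    # Case 2: real roots with b >= 0 and c > 0, both roots are negative,
    # so the sphere lies behind the ray origin.
    if b >= 0.0 and c > 0.0:
        return False
    return True
```

A volume with c < 0 always passes, because the ray origin is then
inside the sphere and one root is positive.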

Case 1: $c_1 > 0$ and $c_2 > 0$

Consider the two sphere bounding volumes in the image space
shown in Figure 4.8(b). The ray R starts outside both sphere
bounding volumes. The related mathematical curves are shown in
(a). All roots of these quadratic equations are positive real.
This means that the two bounding volumes lie in the direction
of the ray and that neither sphere bounding volume includes
the origin P of the ray. The ray R starts at P and hits sp1 at
A and sp2 at B. We define depth as the length from the origin
P of the ray to the nearest ray intersection position on the
sphere bounding volume. For example, the length PA is the
depth of sp1 and the length PB is that of sp2. The related
intersection values of the parameter t in Figure 4.8(a) are
$t_A$ and $t_B$. These intersection values are proportional to
the actual depths. The two bounding volumes can be overlapping
or separated, as seen in Figure 4.8(b). In either case we can
compare the depths with respect to the ray path without using
the square root function.

Figure 4.8 Case 1

The Appendix gives details of the mathematical steps. Using a
flow chart and state variables, we summarize these
mathematical steps in Figure 4.9. The table in Figure 4.9(b)
gives the selection condition $t_1$ using only the state
variables:

ii  = 

-X1X3  +X3X4  +X2X3X4  (4.11) 

Consider one more point about Figure 4.8. sp1 contains a part
of a triangle and sp2 contains another part of the same
triangle. When we perform the intersection test for the
objects in sp1, we get the intersection point I. But I is not
inside sp1. How do we know whether an intersection position
lies within a bounding volume? We call this the "inside" test.
If we knew the intersection values of the bounding volumes
($t_A$, $t_B$, $t_C$, $t_D$ in this example), a comparison of
these values ($t_A \le t_I \le t_C$ or $t_B \le t_I \le t_D$)
would give the proper information. But we want to avoid using
the square root function and use only the coefficients b, c to
compare depths.

The depth comparison described above does not work directly
for objects enclosed in multiple spherical bounding volumes.
Let us consider the meaning of $f_i(t)$. $f_i(t) = 0$ means
that the ray R is on the surface of volume i at the parameter
value t. $f_i(t) < 0$ means that t is between the two
intersection values, so the ray R is inside volume i.
$f_i(t) > 0$ means that the ray R is outside volume i. From
this criterion we know that the intersection happens in sp2
because $f_1(t_I) > 0$ and $f_2(t_I) < 0$, where

$$f_i(t) = (t + b_i)\,t + c_i \qquad (i = 1, 2) \tag{4.12}$$

Even though we did not calculate the intersection values of
the sphere bounding volumes with the ray, we can perform both
the depth comparison and the inside test using only the pair
of quadratic equation coefficients.

Figure 4.9 State variables and Depth comparison
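
The inside test therefore needs only one multiplication and two
additions per bounding volume. A minimal sketch, in which the
function name inside_volume is hypothetical and t_hit is assumed
to be the ray parameter of an object intersection already found:

```python
def inside_volume(t_hit, b, c):
    """Inside test: is the hit at parameter t_hit within the bounding sphere?

    Evaluates f(t) = (t + b)*t + c; the point is inside or on the
    sphere exactly when f(t_hit) <= 0, so no square root is needed.
    """
    return (t_hit + b) * t_hit + c <= 0.0
```

With b = -10 and c = 24 (a sphere entered at t = 4 and left at
t = 6), a hit at t = 5 passes the test and a hit at t = 7 fails
it.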

Case 2: $c_1 < 0$ and $c_2 < 0$

Each function has one positive and one negative root. This
means that the ray starts in the common (overlapping) part of
the two sphere bounding volumes. Figure 4.10(b) shows this
situation. We defined the depth as the length from the origin
of the ray to the nearest intersection point on the bounding
volume, and we apply that definition to this situation. The
purpose of the depth definition is to find the order of the
sphere bounding volumes with respect to the ray path. Keeping
this purpose in mind, we can easily avoid the complicated
selection equations of the previous case.

When we calculate reflected color, the origin of the ray
changes from the viewpoint to the intersection point. If the
intersection point is in the common part of volume 1 and
volume 2, the secondary ray starts from that intersection
point. In Figure 4.10 the depth of sp1 is less than that of
sp2, but the ray will intersect both sp1 and sp2. We cannot
determine an exact order for these two volumes, since both
bounding volumes contain the origin of the ray. When we
compare the depths of the volumes we can select either volume,
so we give the two volumes the same order by defining the
depth to be zero whenever the sphere bounding volume contains
the origin of the ray. The selection condition $t_1$ is a
don't care.

Figure 4.10 Case 2

Case 3: $c_1 < 0$ and $c_2 > 0$

Because function $f_1(t)$ has a negative root and a positive
root, we can infer that the ray starts from the inside of
volume 1 in Figure 4.11. Function $f_2(t)$, however, has two
positive roots. These two curves mean that sphere bounding
volume 2 does not contain the origin of the ray, but bounding
volume 1 does. These sphere bounding volumes can be
overlapping or separated. In this figure, even though the ray
meets volume 2 at $t_B$, we cannot give higher priority to
volume 2. Since volume 1 contains the origin of the ray, its
depth is zero. So volume 1 must be selected in this
comparison. The selection equation is

$$t_1 = 1 \tag{4.13}$$

Figure 4.11 Case 3

Case 4: $c_1 > 0$ and $c_2 < 0$

This is the opposite of the situation of Case 3. Bounding
volume 2, rather than volume 1, contains the ray starting
point, so the selection equation must select volume 2. This
situation is shown in Figure 4.12. The selection equation is

$$t_1 = 0 \tag{4.14}$$

Figure 4.12 Case 4

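If we define two more state variables $\gamma_1$ and
$\gamma_2$, we can combine these four cases into a single
expression for $t_1$:

$$\gamma_1 = \begin{cases} 1 & (c_1 > 0) \\ 0 & (c_1 \le 0) \end{cases} \qquad \gamma_2 = \begin{cases} 1 & (c_2 > 0) \\ 0 & (c_2 \le 0) \end{cases}$$

$$t_1 = t_1^{(1)} \gamma_1 \gamma_2 + t_1^{(3)} \bar{\gamma}_1 \gamma_2 + t_1^{(4)} \gamma_1 \bar{\gamma}_2 = t_1^{(1)} \gamma_1 \gamma_2 + \bar{\gamma}_1 \gamma_2 = \gamma_2 \, ( t_1^{(1)} \gamma_1 + \bar{\gamma}_1 ) \tag{4.15}$$

where $t_1^{(k)}$ denotes the selection condition obtained in
Case k (Eqs. 4.11, 4.13 and 4.14) and the overbar denotes the
logical complement. If $t_1 = 1$ then select $f_1$, else
select $f_2$. Using only the coefficients b and c, we can
compare the two bounding volume depths with respect to the ray
path. Repeating this comparison, we can sort the depths with
respect to the ray for any pixel or for any direction.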
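
To make the four-case selection concrete, the sketch below
combines the cases in the manner of Eq. (4.15). It is an
illustration only: the function names are hypothetical, and the
Case 1 comparison is derived here independently by squaring the
inequality between the two smallest roots, so it is not claimed to
be term-for-term identical to Eq. (4.11) or to the Appendix. It
assumes both volumes have already passed the pre-sort filter.

```python
def nearer_case1(b1, c1, b2, c2):
    """True if the smallest root of f1 is below the smallest root of f2.

    The comparison (-b1 - sqrt(D1))/2 < (-b2 - sqrt(D2))/2 is rewritten as
    L < sqrt(D1) - sqrt(D2) with L = b2 - b1 and resolved by sign analysis
    and squaring, so no square root is evaluated.
    """
    d1 = b1 * b1 - 4.0 * c1
    d2 = b2 * b2 - 4.0 * c2
    left = b2 - b1
    k = 2.0 * b1 * b2 - 4.0 * (c1 + c2)   # equals d1 + d2 - left**2
    if left < 0.0 and d1 >= d2:
        return True                        # negative < non-negative
    if left >= 0.0 and d1 <= d2:
        return False                       # non-negative < non-positive fails
    if left < 0.0:
        # Both sides negative: true when 2*sqrt(d1*d2) > k.
        return k < 0.0 or 4.0 * d1 * d2 > k * k
    # Both sides positive: true when 2*sqrt(d1*d2) < k.
    return k > 0.0 and 4.0 * d1 * d2 < k * k


def select_first(b1, c1, b2, c2):
    """Selection condition t1: True selects volume 1, False selects volume 2."""
    gamma1 = c1 > 0.0      # ray origin outside volume 1
    gamma2 = c2 > 0.0      # ray origin outside volume 2
    if gamma1 and gamma2:
        return nearer_case1(b1, c1, b2, c2)   # Case 1
    # Case 3 gives True, Case 4 gives False; Case 2 is a don't care
    # and falls out as False here.
    return gamma2
```

The quantity k has the same sign test as the state variable $X_3$,
and the squared comparison reduces algebraically to the condition
used for $X_4$, which is why only b and c are required.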

4.5.2 Sorting

Sorting is the process of arranging the sphere bounding
volumes in order of depth. The sphere bounding volumes are
arranged so that the succeeding processes can find the object
nearest to the ray origin with fewer intersection tests than
are needed in the basic ray tracing algorithm. Even though we
employ many spherical bounding volumes, only a few of them
intersect a given ray. The sort is performed with the exchange
algorithm [Lor 75]. The inputs of this sort are the number n
of intersected bounding volumes and n sets of three numbers
(the volume identifier and the coefficients b, c), each of
which represents a sphere bounding volume. This sort algorithm
is presented in Figure 4.13.
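
For readers who prefer executable code, linear selection with
exchange over (id, b, c) triples might look as follows. This is a
sketch, not a transcription of Figure 4.13: the function names are
hypothetical, and the comparator nearer_by_root is a plain
square-root stand-in used only to keep the example self-contained;
in the scheme described above it would be replaced by the
coefficient-only comparison.

```python
import math

def nearer_by_root(a, b):
    """Stand-in comparator on (id, b, c) triples using explicit roots."""
    def depth(vol):
        _, bq, cq = vol
        if cq <= 0.0:
            return 0.0          # ray origin inside the volume: depth zero
        return (-bq - math.sqrt(bq * bq - 4.0 * cq)) / 2.0
    return depth(a) < depth(b)

def sort_volumes(volumes, nearer):
    """Sort the list in place, nearest volume first."""
    n = len(volumes)
    for i in range(n - 1):
        best = i                            # index of nearest volume so far
        for j in range(i + 1, n):
            if nearer(volumes[j], volumes[best]):
                best = j
        # Exchange the nearest remaining volume into position i.
        volumes[i], volumes[best] = volumes[best], volumes[i]
    return volumes
```

Since only a few volumes usually survive the pre-sort filter, the
quadratic number of comparisons of this simple strategy is small
in practice.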

4.5.3 Ray Tracing Algorithm

Ray  tracing  is  an  algorithm  that  works  entirely  in  object 
space.  At  a given  point  in  the  image  plane,  the  visible 
surfaces  are  obtained  by  tracing  a ray  backwards  from  the  eye 
through  the  imaging  point  into  the  scene.  If  this  ray 
intersects  an  object,  then  local  color  calculations  will 
determine  the  color  that  is  the  result  of  illumination  at  that 
point.  This  is  light  from  the  light  sources  directly  reflected 
at  the  surface.  If  the  object  is  partially  reflective, 
partially  transparent,  or  both,  then  the  color  of  the  point  in 
the  image  plane  should  include  a contribution  from  reflected 
and  transmitted  rays.  These  must  be  traced  backwards  to 
discover  their  origin,  and  hence  the  light  they  contribute. 
Determining  a color  for  each  of  these  rays  may  require  the 
tracing  of  further  rays  and  other  intersections  with  objects. 
However,  the  ray  tracing  algorithm  spends  most  of  its  time  in 
the  intersection  calculations. 

To relieve this bottleneck, the new algorithm partitions the
object space and bounds the partitioned space using sphere
bounding volumes at the ray tracing initialization stage. In
calculating the intersection with an object, the new algorithm
first performs a ray intersection test with the sphere
bounding volumes. After these tests, the new algorithm sorts
the sphere bounding volumes intersected by the ray with
respect to depth. After sorting the volumes, a ray
intersection test is performed with the objects which are in
the first sorted sphere bounding volume. Once the algorithm
finds the nearest object from the ray origin, the intersection
test is exited. If not, the ray intersection test continues
with the objects in the next bounding volume, until the last
bounding volume is processed.

procedure SORT (sort, n)
array sort[n];
integer n;
begin
   do i = 1 to n
      select <- 0;
      temp[select].set <- sort[i];
      temp[select].id <- i;
      if (i = n) return (sort);
      do j = i+1 to n
         temp[!select].set <- sort[j];
         temp[!select].id <- j;
         select <- compare (temp[select], temp[!select]);
      enddo
      sort[temp[select].id] <- sort[i];
      sort[i] <- temp[select].set;
   enddo
end

Figure 4.13 Sort Algorithm

Figure 4.14 shows the basic ray tracing algorithm. The
suggested new algorithm is presented in Figure 4.15. The
trace, shade and intersect routines form the heart of the ray
tracing algorithm. The trace routine is emphasized in Figure
4.15 because the shade and intersect routines can be the same
as in the basic ray tracing algorithm. Each of these
algorithms works with a set of object primitives, often just a
collection of triangular surface facets. We can define many
kinds of primitives using simple mathematical equations. Some
primitives cannot be bounded by a finite number of sphere
bounding volumes because they are infinite objects. For
example, the x-y plane, or a cylinder defined by its radius
and orientation (without giving a length), cannot be bounded
by a finite number of spherical bounding volumes. If we
consider such primitives, we need to employ two lists, one for
sphere bounding volumes and the other for infinite objects.
The ray intersection test needs to consider both lists. We
find the nearest object from each list and decide the overall
nearest object by comparing the two intersection values. We
must also perform the inside test for finite objects. These
points are taken into account by the new algorithm outlined in
Figure 4.15.

procedure Ray_trace (start, direction, depth, color)
vector : start, direction
integer : depth
colors : color
begin
   vector : intersection_point, reflected_direction,
            transmitted_direction
   colors : local_color, reflected_color, transmitted_color

   color <- black;
   if (depth <= maxdepth)
   begin
      color <- back_ground_color;
      [intersect ray with all objects and find intersection
       point (if any) that is closest to start of ray]
      if (intersection)
      begin
         local_color <- [contribution of local color model at
                         intersection point]
         { Calculate direction of reflected ray }
         Ray_trace (intersection_point, reflected_direction,
                    depth+1, reflected_color)
         { Calculate direction of transmitted ray }
         Ray_trace (intersection_point, transmitted_direction,
                    depth+1, transmitted_color)
         Combine (color, local_color, local_weight_for_surface,
                  reflected_color, reflected_weight_for_surface,
                  transmitted_color, transmitted_weight_for_surface)
      end
   end
end

Figure 4.14 Ray Tracing Algorithm

procedure Ray_trace (start, direction, depth, color)
vector : start, direction
integer : depth
colors : color
begin
   vector : intersection_point, reflected_direction,
            transmitted_direction
   colors : local_color, reflected_color, transmitted_color

   if depth > maxdepth then color <- black
   else
   begin
      color <- back_ground_color;
      { intersect ray with all sphere bounding volumes }
      if (bounding volume intersection)
      begin
         { sort n sphere bounding volumes which are intersected
           with ray }
         do i = 1 to n
            { 1. intersect ray with all objects in the i-th
                 depth's sphere bounding volume
              2. find intersection point (if any) that is
                 nearest to origin of ray
              3. check inside test for nearest point whether
                 it is true or not
              4. update the nearest point if item 3 is true
              5. quit do loop if nearest point found in the
                 i-th depth's sphere bounding volume }
         enddo
      end
      [intersect ray with all objects in infinite object list and
       update intersection point (if any) that is nearest to origin
       of ray]
      if (intersection)
      begin
         local_color <- [contribution of local color model at
                         intersection point]
         { Calculate direction of reflected ray }
         Ray_trace (intersection_point, reflected_direction,
                    depth+1, reflected_color)
         { Calculate direction of transmitted ray }
         Ray_trace (intersection_point, transmitted_direction,
                    depth+1, transmitted_color)
         Combine (color, local_color, local_weight_for_surface,
                  reflected_color, reflected_weight_for_surface,
                  transmitted_color, transmitted_weight_for_surface)
      end
   end
end

Figure 4.15 New Algorithm
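
The nearest-hit search at the core of the new trace routine can be
sketched as follows. This is a simplified illustration under
stated assumptions, not the routine of Figure 4.15 itself: the
names sphere_hit and nearest_hit are hypothetical, the bounding
volumes are assumed to arrive already filtered and sorted nearest
first, each volume is a dictionary holding its coefficients b, c
and its object list, and every object is a callable that returns
the ray parameter of its nearest hit or None.

```python
import math

def sphere_hit(origin, direction, center, radius):
    """Example primitive: smallest positive t of a ray-sphere hit, or None."""
    oc = [o - s for o, s in zip(origin, center)]
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in ((-b - root) / 2.0, (-b + root) / 2.0):
        if t > 1e-9:
            return t
    return None

def nearest_hit(origin, direction, sorted_volumes, infinite_objects):
    """Return the ray parameter of the nearest object hit, or None."""
    best = None
    for vol in sorted_volumes:
        hits = [obj(origin, direction) for obj in vol["objects"]]
        hits = [t for t in hits if t is not None]
        if not hits:
            continue
        t = min(hits)
        # Inside test: accept the hit only if it lies within this volume.
        if (t + vol["b"]) * t + vol["c"] <= 0.0:
            best = t
            break                       # nearer volumes were already tried
    # Infinite objects are kept in a separate list and always tested.
    for obj in infinite_objects:
        t = obj(origin, direction)
        if t is not None and (best is None or t < best):
            best = t
    return best
```

The early exit is what saves work: once a hit passes the inside
test in the i-th volume, the objects of all farther volumes are
never tested.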

4.6  Data  Structure 

Several kinds of data types are required to process ray
tracing algorithms efficiently. To describe the objects in the
image space we need to define and link primitives. Vector
operations are also required in the ray tracing algorithm. The
basic data structures for ray tracing were studied by Heckbert
[Hec 89]. Our algorithm is designed to avoid performing the
ray intersection test with all objects. It employs bounding
volumes and allows an object to be shared by overlapping
bounding volumes.

By defining new data structures for the primitives and the
linked list, we can save memory space and reduce ray tracing
execution time.

4.6.1 Primitives

One of the goals of writing a ray tracing program is to make
it easily extensible, so that anybody can use it to try out
various advanced ray tracing techniques. This requirement
calls for an object-oriented programming style. Rather than
group the software by procedures, we group it by data
structures. Thus instead of collecting all the intersection
methods into one file and all the normal vector formulas into
another, we split the problem another way, by collecting the
procedures for each primitive type into a file of its own. For
example, there is one file containing sphere-related routines,
another file for polygon-related routines, etc. This has the
advantage that primitive-dependent information can be hidden
in data structures local to each file, and the procedure
interfaces can be very simple and generic. Adding new
primitives to the system becomes easy with this scheme. Since
the details of each primitive's data structure are local to
that sub-module, all operations on the primitive must be
supported by generic procedures, the most important of which
are procedures for intersection, for normal calculations, and
for reading the specification of a primitive [Hec 89].

We must also check a ray identification field in each object,
which records whether the object has already been tested
against the current ray. Figure 4.5 shows two primitives in
bounding volume 1. The test result is that ray R2 misses
object A. The next sphere bounding volume in the order
produced by the depth sort algorithm is volume 2, so the ray
intersection test is performed next for the objects in volume
2. At this stage we know, as a result of the previous stage of
the calculations, that ray R2 does not meet object A. However,
the ray tracing algorithm does not keep track of previous
intersections, so it again performs the ray intersection test
for object A. The ray intersection test takes more time than a
comparison operation. By employing a ray identification field
in the primitive, we can overcome this problem. When ray R2 is
to be tested for intersection with object A, the algorithm
checks whether the ray id field of object A is R2 or not. If
the field is not R2, then the algorithm performs the ray
intersection test with object A. If the ray misses object A,
then the algorithm writes R2 in the ray id field of object A.
If the ray hits object A in another volume (i.e., a ray
intersection with object A occurred but the inside test
failed), then the algorithm does not write R2 in the ray id
field of object A. In volume 2, ray R2 then does not need the
intersection test with object A: when the algorithm checks the
ray id field, it finds that object A has been tested in a
previous volume and that ray R2 missed object A. Thus we can
avoid unnecessary intersection tests by employing a ray id
field with each finite primitive.

4.6.2 Linked List

There are three kinds of object lists in the new algorithm.
The first holds the finite objects, which can be bounded by a
finite number of sphere bounding volumes; the second holds the
infinite objects, which cannot be bounded by a finite number
of sphere bounding volumes; the last is a light list. Finite
objects are partitioned by the bounding procedure, and each
sphere bounding volume has its own linked list.

The new depth sort algorithm allows objects to be shared
between overlapping sphere bounding volumes. Figure 4.16 shows
four objects which are bounded by three sphere bounding
volumes. To represent this in the image space with a separate
object list per volume, we use eight objects, two sets of A,
B, C, D. Each object needs much more memory than two pointer
fields, because each object has its own primitive
specification fields as well as pointer fields for the
intersect procedure and the normal vector procedure. This kind
of link structure is adequate for a small number of objects in
an image space. For a large number of objects we need to
employ a new link structure to save memory space. Such a link
structure is presented in Figure 4.16(c). There is one object
list which links every finite object in the image space. Each
sphere bounding volume has a list built from the new data
structure, which has two fields: one links to the next item,
the other points to the corresponding object. So instead of
eight objects, eight pointers are enough to represent the
objects in the image.
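
A minimal sketch of this two-field link structure is given below.
The class names and the particular assignment of objects to
volumes are hypothetical (Figure 4.16 is not reproduced here); the
point is only that each object is stored once and every volume
list holds references to it.

```python
class SceneObject:
    """Stored once; stands in for the full primitive specification."""

    def __init__(self, name):
        self.name = name

class Node:
    """Two-field list node: a pointer to the object and a link to the next."""

    def __init__(self, obj, next_node=None):
        self.obj = obj
        self.next = next_node

def build_list(objects):
    """Build a singly linked list of nodes referring to existing objects."""
    head = None
    for obj in reversed(objects):
        head = Node(obj, head)
    return head

def names(head):
    """Walk a volume list and collect the object names."""
    out = []
    while head is not None:
        out.append(head.obj.name)
        head = head.next
    return out
```

With four objects shared among three volume lists containing eight
nodes in total, memory grows with the number of references rather
than with the number of duplicated object records.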

4.6.3  Sorting 

The sort procedure requires two buffer registers to compare
the depths of two sphere bounding volumes. Each sphere
bounding volume in the sort procedure is represented by three
items: the two coefficients b, c of the quadratic equation and
an identifier of the sphere bounding volume. The candidate
bounding volumes are stored in an array in memory.

Because the sorting strategy is linear selection with
exchange, we need to keep the array index of the nearest
volume during the comparison procedure. The data structure for
each of the two buffer registers therefore consists of the
triple (b, c, id) and the array position of the nearest
volume. Two buffer registers make the exchange between two
sphere bounding volumes easy.

4.7  Hardware  Consideration 
4.7.1  Introduction 

The basic idea of the new algorithm is to reduce the number of
ray-object intersections which need to be considered in an
image space, because the ray tracing algorithm spends most of
its execution time in the intersection stage. We employ sphere
bounding volumes to find the nearest object.

Since the sphere bounding volume strategy suffers from the
void volume problem, we allow an object to overlap a few
sphere bounding volumes. This requires many intersection point
calculations for the sphere bounding volumes in order to sort
depths with respect to the ray path. We know that the nearest
bounding volume has a high probability of containing the
nearest object. We need not calculate the intersection points
of the bounding volumes; to sort depths and reduce execution
time, we only need to determine the intersection status, i.e.
whether the ray hits each bounding volume or not.

When we employ a small number of bounding volumes, each
bounding volume contains more objects than when we employ many
bounding volumes, and the ray-object intersection calculation
time will be considerable for each sphere bounding volume.
When we employ a large number of bounding volumes, the void
area of each bounding volume will be smaller than when we
employ a small number of bounding volumes. We already
discussed a possible hardware implementation of the ray-object
depth sorter in section 4.2. The bottleneck caused by
employing a large number of bounding volumes can be resolved
in the design of the hardware. Since this hardware considers
only the sphere bounding volumes, it is unaffected when we
employ new primitives to describe image parts. So using this
hardware, we can still extend the ray tracing algorithm
through software.

That hardware performs two functions:

1. ray intersection tests against every sphere bounding volume
for the given ray information (direction, origin);

2. depth sorting of the output of item 1.

Whenever the ray direction changes, the intersection part must
perform tests for all bounding volumes. This task may be
computationally very intensive because many rays must be
considered during the execution of the ray tracing algorithm.
Since the intersection part and the depth test procedure always
run in a fixed order, we can implement the intersection part as
a pipeline.

But the output of the intersection part is not constant in size.
For example, in some cases the ray does not hit any sphere
bounding volume, and in some cases the ray hits several of them.
Depth sorting of the sphere bounding volumes is performed on the
output of the intersection part (the presort part). The number
of bounding volumes being sorted varies for each ray, and the
time for sorting is not fixed. A direct connection between the
ray intersection part and the depth sorter is therefore not a
good design. There must be an array of memory between them to
act as a buffer.

This array stores the output of the intersection part and also
serves as the input to the depth sorter. Figure 4.17 shows the
block diagram of this hardware. The contents of the sphere
bounding volume in Figure 4.17 do not change during the
execution of ray tracing; those values are assigned at the
initialization stage. Whenever the ray information changes, the
contents of the buffer change.

4.7.2  The  Ray  Intersector 

This  device  performs  the  ray  intersection  tests  with  all 
spherical  bounding  volumes.  Since  the  information  about  these 
volumes  is  always  used  for  testing  ray  intersections,  the 
information  is  stored  in  a memory-resident  array.  The 
information  consists  of  the  center  of  the  sphere,  the  radius 
of  the  sphere,  and  a number  that  identifies  the  sphere. 
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Figure  4.17  Block  Diagram  of  Hardware 
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The  ray  intersector  has  two  parts,  as  shown  in  Figure 
4.18(a),  the  Bounding  Volume  Coefficient  Generator  (BVCG)  and 
the  ray  intersector  itself.  The  design  details  of  the  BVCG 
are  given  in  Figure  4.18(b).  A ray  is  defined  by  two  pieces 
of  information:  the  origin  and  the  direction.  These  pieces  of 
information  are  used  to  generate  the  coefficients  of  the 
quadratic  equations  (4.5)  and  (4.6)  for  each  bounding  sphere. 
To obtain b from b/2 we perform a left shift, an operation
which can be implemented by shifting the wiring (or merely
labeling the bus wires with a one-higher subscript). Five steps
are required to get b; the coefficient c requires four steps.
Each row in that figure represents one step. When we implement
these steps as parallel pipeline elements, the coefficients are
produced at the same pipeline stage.
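
Equations (4.5) and (4.6) are not repeated in this section. As a
minimal sketch, assuming they take the usual form
t^2 + b t + c = 0 for a unit-length ray direction d, ray origin
o, sphere center s and radius r, with b/2 = d . (o - s) and
c = |o - s|^2 - r^2, the coefficient generation could be written
in C as follows. The type and function names are hypothetical and
serve only as illustration.

```c
/* Hypothetical types for illustration; the names are not from the thesis. */
typedef struct { double x, y, z; } Vec3;

typedef struct {
    Vec3   center;   /* center of the bounding sphere */
    double radius;   /* radius of the bounding sphere */
    int    id;       /* number that identifies the sphere */
} BoundingSphere;

typedef struct {
    double half_b;   /* b/2 */
    double b;        /* b, obtained by doubling b/2 (a left shift in hardware) */
    double c;
    int    id;       /* volume identifier carried along with the coefficients */
} Coefficients;

static double dot(Vec3 u, Vec3 v)
{
    return u.x * v.x + u.y * v.y + u.z * v.z;
}

/* Assumption: dir has unit length, so the t^2 coefficient is 1. */
Coefficients bvcg(Vec3 origin, Vec3 dir, BoundingSphere s)
{
    /* First step: difference between the ray origin and the sphere center. */
    Vec3 oc = { origin.x - s.center.x,
                origin.y - s.center.y,
                origin.z - s.center.z };
    Coefficients k;
    k.half_b = dot(dir, oc);
    k.b      = 2.0 * k.half_b;
    k.c      = dot(oc, oc) - s.radius * s.radius;
    k.id     = s.id;
    return k;
}
```

For a ray from the origin along +z and a unit sphere centered at
(0, 0, 5), this sketch yields b = -10 and c = 24, whose roots
t = 4 and t = 6 are the entry and exit distances.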

Using these two coefficients (b and c), the root position test
(RPT) part of the ray intersector hardware tests whether the
roots are real and where they lie. When the root tests show
possible real intersection(s), the coefficients and the volume
identifier are moved to a buffer for the sorting calculations.
The discriminant D in Equation (4.8) indicates whether the roots
are real or not. The most important part of that calculation is
the sign of the result, because a non-negative sign means that
the ray hits the bounding volume. This makes the hardware needed
for the determination of real roots very simple.
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Figure  4.18  Block  Diagram  of  Ray  Intersector 
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Figure  4.18  Continued  (BVCG) 
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Figure  4.18  Continued  (Pipelined  Ray  Intersector) 
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We  can  obtain  the  values  of  b/2  and  c directly  from 
Figure  4.18(b).  The  details  of  the  design  of  the  RPT  are  shown 
in  Figure  4.18(c).  By  using  D/4  we  can  reduce  the  pipeline 
stages  by  one.  To  test  for  the  root  position  in  the  RPT 
module,  we  need  three  inputs  (the  sign  of  b/2,  the  sign  of  D, 
and  the  sign  of  c).  Different  implementations  may  use  varying 
formats  for  mantissa  length,  exponent  length,  radix,  encoding 
of  negative  numbers,  and  possible  use  of  a floating  point 
hidden  bit.  However,  in  microcomputer  systems  the  1985  IEEE 
floating  point  standard  is  becoming  widely  established.  In 
this  standard  the  sign  bit  is  zero  for  a non-negative  number, 
and  it  is  one  when  the  number  is  negative. 

Figure  4.19  shows  the  relationship  between  the  sign  bits 
and  the  related  quadratic  curves.  From  the  associated  Table 
(a)  we  derive  the  switch  relationship: 

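switch = \bar{D} b + \bar{D} c = \bar{D} (b + c)          (4.16)

where D, b, and c denote the sign bits of the corresponding
quantities and the overbar denotes the logical complement.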

When  the  switch  is  TRUE  the  coefficients  b and  c,  as  well 
as  the  volume  identifier  are  moved  to  the  output  buffer  of 
this  stage.  When  the  switch  is  FALSE,  the  output  will  be 
overwritten  by  the  result  of  the  next  test. 
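
The root position test can be sketched in C. This is an
illustration under stated assumptions, not the circuit of Figure
4.18(c): it assumes the quadratic t^2 + b t + c = 0, uses
D/4 = (b/2)^2 - c, and reads the switch as TRUE when the sign bit
of D is 0 and the sign bit of b or of c is 1 (real roots, at
least one of them positive). The function names are hypothetical.

```c
#include <stdbool.h>

/* Sign bit in the sense of the text: 0 for non-negative, 1 for negative.
   This sketch compares values, so a negative zero counts as non-negative. */
static int sign_bit(double v)
{
    return v < 0.0 ? 1 : 0;
}

/* Returns true when the coefficients should be moved to the output buffer. */
bool rpt_switch(double half_b, double c)
{
    double quarter_d = half_b * half_b - c;   /* D/4 = (b/2)^2 - c */
    int sd = sign_bit(quarter_d);             /* 0: roots are real */
    int sb = sign_bit(half_b);                /* 1: sum of roots is positive */
    int sc = sign_bit(c);                     /* 1: roots have opposite signs */
    return !sd && (sb || sc);
}
```

With b/2 = -5 and c = 24 the switch is TRUE (roots 4 and 6 lie in
front of the origin); with b/2 = 5 and c = 24 it is FALSE,
because both roots are negative and the sphere lies behind the
ray origin.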

A serial connection of the BVCG and the RPT alone will not
complete the intersection pipeline. Each bounding volume must
be processed at the same level of this pipeline. To keep the
relevant information we need to employ registers in the BVCG as
well as in the RPT. Registers are also needed in the ray
intersector to keep track of the volume identifier numbers. With
these considerations, the ray intersector pipeline is
implemented as shown in Figure 4.18(d). The pipeline has six
stages. The first stage forms the difference between the ray
origin and the center of the sphere. The coefficients b and c in
equations (4.5) and (4.6) are generated in the second through
fourth stages. The discriminant of Equation (4.8) is evaluated
in the fifth and sixth stages. The root position test is also
performed in the sixth stage.

The second and fifth stages are multipliers and the other
stages are adders/subtracters. When this pipeline is
implemented we need to consider the throughput of the pipeline
and the required machine cycle times. The throughput is the
rate of completion of instructions in the pipeline. Since the
pipeline stages are sequential, all stages must operate at the
same cycle time. The cycle time is the instruction completion
time plus the time required to initialize the next stage. The
maximum rate of operation is determined by the slowest
pipeline stage.

Multiplication  could  be  accomplished  with  a massive 
parallel  multiplier.  However,  hardware  considerations 
normally  limit  multiplication  hardware  to  be  no  more  than 
repeated  additions.  Hence  we  can  infer  the  slowest  pipeline 
stage.  Stage  six  consists  of  a subtractor  and  two  levels  of 
logic gates. Generally subtraction is much faster than
multiplication, so the work in stage six can be performed
during the multiplication time needed in the other stages.
Hence the performance of the ray intersector will be limited
by the multiplier times required in stages two and five.

The ray intersection hardware produces one ray intersection
result per multiplication execution time. Hence to process N
bounding spheres, the hardware will need time equivalent to
approximately N multiplications.
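
As a worked estimate, assume that the six-stage pipeline accepts
one bounding sphere per cycle and that the cycle time equals one
multiplication time. The symbol t_mul is introduced here only for
illustration. The time to process N spheres is then:

```latex
T(N) = (6 + N - 1)\, t_{\mathrm{mul}}
     = (N + 5)\, t_{\mathrm{mul}}
     \approx N\, t_{\mathrm{mul}} \qquad (N \gg 5)
```

The five extra cycles are the time needed to fill the pipeline
before the first result appears. For a large number of bounding
spheres they are negligible, which gives the estimate of
approximately N multiplications.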

4.7.3  The Depth Sorter

We can easily compare the depths of volumes from the
characteristics of the spheres, their centers and radii. The
basic idea of the depth sorter is embedded in the state
variables of Section 4.5.1. The block diagram of the depth
sorter is shown in Figure 4.20. The information about a sphere
bounding volume which intersects the ray is fetched from the
output buffer of the previous stage and is moved to one of the
input registers (tempA or tempB). The selection of which
register to use is made by the output of the Select Module (SM).
When the output of the SM is 0, tempA is used; otherwise tempB
is used. The selected register receives the information about
the bounding volume which intersected the ray. Depth comparisons
continue until the last buffer entry has been received. The
control unit issues an "end-of-list" signal on completion of the
bounding volume list. The results from the depth comparisons are
stored in a buffer through a 2:1 MUX. Figure 4.13 explains the
data flow and the data storage in the depth sorter.

Figure 4.20  Block Diagram of Depth Sorter
(SVG: State Variable Generator)

The key function of the depth sorter is depth comparison. Two
modules are involved in this. One module (SVG) generates the
state variables x_1, x_2, x_3, x_4, y_1, and y_2. The state
variables are generated from the results of the earlier
calculations. Using these state variables, the other module
(Select) selects the particular temp register (tempA or tempB).

Figures 4.21 (a), (b), (c) and (d) show the calculation of the
state variables. x_1 results from a simple subtraction between
the two b's. The calculations for x_2 and x_3 require three
steps. Each row represents one step. Figures 4.21 (b) and (c)
indicate left shift operations. Twice an integer value is that
value left shifted by one place. For floating point values the
same is achieved by adding one to the exponent of a radix-2
floating point number. Although the algorithms indicate
multiplications, we get the same results from shifting (which
is equivalent to bus-wire rerouting) or from a short-word
integer addition. The state variable x_4 is the result of four
operations. This is the most complicated of the calculations
that go into producing the state variables. The time taken by
the state variable generator depends on the calculation of x_4.

Figure 4.21  State Variables x_1 and x_2

Figure 4.21  Continued (x_3 and x_4)

The following are the operations necessary for the state
variables x_1 through x_4:

x_1 = b_1 - b_2                                            (4.17)

x_2 = (b_1^2 - b_2^2) - 4 (c_1 - c_2)                      (4.18)

x_3 = b_1 b_2 - 2 (c_1 + c_2)                              (4.19)

x_4 = (b_2 c_1 - b_1 c_2)(b_1 - b_2) - (c_1 - c_2)^2       (4.20)

Since the state variables are always compared with the value
zero, we can use the sign bits of x_1, x_3, and x_4 directly in
the indicated operations. For x_2 the sign needs to be inverted,
potentially a trivial hardware solution, using at most a single
gate.
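
The role of the state variables can be illustrated in C. The
sketch assumes the readings x_1 = b_1 - b_2,
x_2 = (b_1^2 - b_2^2) - 4(c_1 - c_2),
x_3 = b_1 b_2 - 2(c_1 + c_2) and
x_4 = (b_2 c_1 - b_1 c_2)(b_1 - b_2) - (c_1 - c_2)^2. With
D_i = b_i^2 - 4 c_i these satisfy x_2 = D_1 - D_2,
2 x_3 = D_1 + D_2 - x_1^2 and 4 x_4 = D_1 D_2 - x_3^2, so their
signs are enough to order the near roots (-b_i - sqrt(D_i))/2
without taking a square root. The decision procedure below is one
derivation from these identities; it is not claimed to be the
exact logic of the Select module, and it does not use the y
variables. All names are hypothetical.

```c
#include <stdbool.h>

/* Coefficients of t^2 + b t + c = 0 for one bounding sphere. */
typedef struct { double b, c; } Quad;

typedef struct { double x1, x2, x3, x4; } StateVars;

StateVars state_variables(Quad p, Quad q)
{
    StateVars s;
    double dc = p.c - q.c;
    s.x1 = p.b - q.b;
    s.x2 = (p.b * p.b - q.b * q.b) - 4.0 * dc;
    s.x3 = p.b * q.b - 2.0 * (p.c + q.c);
    s.x4 = (q.b * p.c - p.b * q.c) * s.x1 - dc * dc;
    return s;
}

/* True when the near root of p is strictly smaller than the near root of q.
   Assumption: both quadratics have real roots (both volumes are hit). */
bool nearer(Quad p, Quad q)
{
    StateVars s = state_variables(p, q);
    if (s.x1 > 0.0) {
        /* b_2 - b_1 is negative */
        if (s.x2 >= 0.0)
            return true;                    /* sqrt(D_1) >= sqrt(D_2) */
        return s.x3 < 0.0 || s.x4 > 0.0;    /* sqrt(D_1 D_2) > x_3 */
    }
    /* b_2 - b_1 is zero or positive */
    if (s.x2 <= 0.0)
        return false;                       /* sqrt(D_1) <= sqrt(D_2) */
    return s.x3 > 0.0 && s.x4 < 0.0;        /* sqrt(D_1 D_2) < x_3 */
}
```

For example, with p = (b, c) = (-6, 8), whose roots are 2 and 4,
and q = (-10, 21), whose roots are 3 and 7, the state variables
are x_1 = 4, x_2 = -12, x_3 = 2 and x_4 = 15, and the sketch
reports p as the nearer volume using only sign tests.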

The  variables  y are  treated  similarly  as  the  variables  x. 
With  IEEE  floating  point  numbers,  for  the  representation  of 
zero,  the  mantissa  part  is  always  zero,  and  when  the  number  is 
not  zero,  the  first  bit  of  the  mantissa  is  not  zero.  Using 
this  property  the  state  variables  y are: 

y = si  (4.21) 

where  s:  sign bit of coefficient c

f:  first bit of c's mantissa part

The relationship between the state variables x_1 through x_4
is summarized in Figure 4.21. In the pipeline, the state
variable y_1 is obtained from c_1 in tempA and y_2 is obtained
from c_2 in tempB.

Figure 4.22  State Variable Y

The relationships between the various x state variables are
presented in Figure 4.23. This figure also details the steps we
need to take to design the SVG (State Variable Generator) as a
pipeline. The process requires four steps. Note that x_1 is
completed in stage 1, but x_4 is not complete until stage 4.
Hence the selection cannot be completed until stage 4 is
finished. Because of the basic characteristics of a sort
function, it is not easily implemented in a pipeline.

The  six  state  variables  are  generated  in  the  SVG  module, 
and  those  six  are  input  into  the  Select  module,  which  performs 
simple  logic  operations.  Figure  4.24  presents  the  logic 
diagrams  of  the  Select  module.  We  replace  the  AND  gate  A in 
Figure  4.24  with  an  XOR  gate  by  replacing  the  negation  gate  in 
Figure  4.23.  Thus  the  depth  sorter  in  Figure  4.23  will  sort 
the  depths  of  the  sphere  bounding  volumes  using  only  two 
modules,  SVG  and  Select.  The  ray  tracing  algorithm  uses  only 
information  in  memory  buffers  where  information  about  those 
depth  sorted  volumes  is  stored.  The  number  of  bounding 
volumes  which  are  employed  in  the  ray  tracing  algorithm 
depends  on  the  characteristics  of  the  hardware;  ideally  it 
should  hold  in  its  own  buffers  all  the  information  needed  for 
all  images  it  is  to  handle. 
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Figure  4.23  Relationship  between  Variables 
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Figure  4.24  Select  Module 


CHAPTER  5 

EXPERIMENTAL  RESULTS 


The  main  problem  of  the  ray  tracing  algorithm  is  the 
number  of  ray  intersections  with  the  various  objects  in  the 
scene.  To  find  one  intersection  point,  the  ray  must  perform 
the  intersection  tests  with  all  the  objects  in  the  image 
space.  This  is  the  most  severe  bottleneck  in  the  ray  tracing 
algorithm.  The  traditional  fast  algorithms  attack  this 
bottleneck  by  employing  bounding  volumes  or  by  partitioning 
the  image  space.  Each  algorithm  has  its  own  limitations  as 
explained  in  chapter  3.  To  overcome  these  problems  (including 
the  problem  of  each  fast  algorithm  and  that  of  the 
bottleneck),  we  suggested  a new  algorithm  in  chapter  4 for  the 
general  ray  tracing  procedure.  In  this  chapter  we  present  the 
results  of  extensive  simulations  of  the  performance  of  the 
proposed  algorithm.  We  pay  special  attention  to  the  following 
two  aspects : 

1.  the  location  of  the  bottleneck  in  the  procedure; 

2.  performance  of  the  new  algorithm  at  the  bottleneck. 

This  simulation  was  performed  on  an  RS6000  workstation. 
To  compare  the  performance  between  the  original  ray  tracing 
algorithm  and  the  new  algorithm,  we  employ  simple  primitives. 
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The  primitives  are  the  sphere,  the  plane,  and  the  triangle.  In 
the  simulation  we  employ  two  models.  One  model  consists  of  2 
planes  and  264  spheres,  the  other  model  consists  of  4096 
triangles . 

The  big  difference  between  the  two  models  is  that  one  has 
a small  number  of  objects  in  the  image  space  and  the  other 
model  has  a large  number  of  objects.  The  simulation  is  first 
performed  using  a small  image  resolution  (80x50),  then  the 
tests  are  repeated  with  a much  higher  resolution  (640x400). 
Analyzing  the  two  sets  of  results,  we  find  relations  between 
the  two  models.  For  convenience  we  give  a name  to  each  model. 
The  first  model,  which  has  the  small  number  of  primitives  is 
called  "trees"  (see  Figure  5.1).  The  second  model,  shown  in 
Figure  5.2,  is  called  "delta".  The  resolution  of  both 
displayed  pictures  is  1024  x 768. 

Before  we  calculate  these  pictures  we  simulate  both  at  a 
lower  resolution.  Because  computation  of  a high  resolution 
image  takes  a long  time,  we  start  from  a small  resolution  in 
order  to  save  computation  time  (and  to  check  for  possible 
program  errors ) . 

In these simulations, we measure the initialization time, the
ray tracing time and the number of ray intersections with the
objects for each model and given image resolution. Before
starting ray tracing, the algorithm needs to read the objects
and link them to each other or to some bounding volumes. The
time taken by this step is called the initialization time.
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Figure  5.1  Trees 
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Figure  5.2  Delta 
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After  initializing  the  environment,  the  algorithm  traces 
the  rays  to  draw  the  entire  image  on  the  screen.  The  time 
taken  by  this  step  is  called  the  ray  tracing  time.  Generally 
ray  tracing  time  consists  of  four  parts: 

1.  ray  time  : total  time  for  making  new  rays; 

2.  trace  time  : total  time  for  ray-object  intersections; 

3.  shade  time  : total  time  for  color  selection  at  the  ray 
intersection  point; 

4.  write  time  : total  time  for  writing  the  output  to  some 
file . 

To measure the times listed above, we use two different clocks
(calendar time and process time). We record the start time and
the end time; the difference between these two times gives the
process time. In some cases the measured total process time
overflows because the process time resolution is very fine and
the processing time is very long. In these cases we use the
calendar time (wall-clock time) to measure the performance.
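
The two measurements can be sketched with the standard C library.
The text does not name the functions that were used, so the use
of clock() for process time and time() for calendar time is an
assumption, as are the helper names. The last function estimates
when a signed tick counter wraps. For an assumed 32-bit counter
at 1,000,000 ticks per second that is about 2147 seconds (roughly
36 minutes), which is short compared with a run of many hours.

```c
#include <time.h>

/* Hypothetical helper type holding both start stamps. */
typedef struct {
    clock_t cpu_start;    /* process time at start, in clock ticks */
    time_t  wall_start;   /* calendar time at start */
} Stopwatch;

Stopwatch stopwatch_start(void)
{
    Stopwatch s;
    s.cpu_start  = clock();
    s.wall_start = time(NULL);
    return s;
}

/* Process (CPU) time in seconds since start; unreliable once clock() wraps. */
double stopwatch_cpu_seconds(Stopwatch s)
{
    return (double)(clock() - s.cpu_start) / CLOCKS_PER_SEC;
}

/* Calendar (wall-clock) time in seconds since start; one-second resolution. */
double stopwatch_wall_seconds(Stopwatch s)
{
    return difftime(time(NULL), s.wall_start);
}

/* Seconds until a signed tick counter of the given bit width wraps. */
double wrap_horizon_seconds(int bits, double ticks_per_second)
{
    double ticks = 1.0;
    for (int i = 0; i < bits - 1; ++i)
        ticks *= 2.0;                 /* 2^(bits-1) positive tick values */
    return ticks / ticks_per_second;
}
```

A caller would take a Stopwatch before a phase such as
initialization or tracing and read both elapsed values after it,
falling back to the wall-clock value when the run is longer than
the wrap horizon.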

When we examine a workstation, we find that many processes are
running on the system. Even though we remove the unnecessary
processes in order to measure the exact processing time, some
processes remain and are included in the system time. Since the
CPU distributes its time among the processes, the measured
process time varies somewhat for the same job. For example, the
initialization time in Figure 5.3 should be the same as the
initialization time in Figure 5.6. (The figures are grouped at
the end of this chapter.) But a slight difference is found when
we compare the corresponding two columns in the tables of
measurements.

This small deviation is not an important factor when we analyze
the performance of each algorithm. The dominant factor (trace
time) is not affected by this deviation. When we measure the
calendar time, it shows more time than the process time,
because calendar time includes the time of the clock function
calls that yield the process time. In the tables, a "0" for the
number of bounding volumes means that ray tracing is performed
by the original algorithm, as shown in Figures 5.3, 5.6, 5.9,
and 5.12. Based on section 4.4, automatic bounding is performed
on the model "trees".

When we compare the initialization times for the two
resolutions (80x50, 640x400), we find that the initialization
stage is not affected by the resolution, but the ray tracing
time depends on the image resolution. For a small window
(80x50), the initialization time takes a big portion of the
whole process time. For example, initialization is 10% of the
whole processing time for the case where many bounding volumes
are used, as in the last row of Figure 5.3. The initialization
time shown in the last row of Figure 5.6 is a smaller portion of
the total processing time than that shown in Figure 5.3. Since
the initialization stage is performed in the image space, this
time is not affected by resolution but is affected by the number
of bounding volumes. The amount of initialization time is almost
constant across resolutions.

The process time consists of four times, as we remarked before.
Figure 5.4 (a) compares the initialization time and each
component of the ray tracing time. Figure 5.4 (b) compares the
same quantities as Figure 5.4 (a) except the trace time. The
other factors are not significant in Figure 5.4 (b). This
phenomenon is the same in Figures 5.7 and 5.13. When we compare
the corresponding four columns in Figures 5.3 and 5.6, trace
time is found to be the dominant factor for ray tracing
calculations. In the traditional ray tracing algorithm,
ray-object intersection tests are done during trace time. In the
new algorithm, not only the ray-object intersection tests but
also the sorting of bounding volumes is performed during trace
time.

Trace time depends on the number of ray-object intersections in
the traditional ray tracing algorithm. As we employ more
bounding volumes, there are fewer ray-object intersections. The
ray tracing time (and likewise the trace time) decreases as the
number of bounding volumes increases, until some limiting number
of bounding volumes is reached. When the number of bounding
volumes is increased beyond that point, the process time
increases. The reason is that the sorting procedure for the
bounding volumes then takes more time than the ray-object
intersection tests. Figure 5.5 shows the relation between the
trace time and the number of ray-object intersections.
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The  basic  ideas  of  the  new  ray  tracing  algorithm  are  the 
following: 

1. the sort procedure for bounding volumes is performed by the
Depth Sorter (hardware);

2. ray-object intersection is performed by software.

Thus in the new algorithm sorting time can be ignored if we
employ a reasonably small number of bounding volumes. The trace
time (and likewise the ray tracing time) then depends only on
the number of ray-object intersections. When the light sources
are in the image space, the time taken for shading calculations
depends on the number of ray-object intersections, because the
procedure for checking whether the intersection point is in
shadow or not requires ray-object intersections. Ray time and
write time do not depend on the number of bounding volumes.

In this simple model, "trees", we did not put any light sources
in the space occupied by the objects, so shade time is not
affected by the number of bounding volumes. Using software
simulation only, we reduce the ray tracing time; the new
algorithm is three times faster than the traditional ray tracing
algorithm. If we put a few light sources in the image space, we
get more efficient results than those of Figures 5.3 and 5.6.

The model "delta" has 4096 triangles. The image of this model
is calculated for two resolutions (80x50, 640x400). To get
simple comparisons we did not put light sources in this model.
As we expected, the number of ray-object intersections decreases
when we increase the number of bounding volumes. This is shown
in Figures 5.11 and 5.14. Because of the behavior of the sort
procedure for bounding volumes, trace time increases when the
number of bounding volumes exceeds 442. Using only software
simulation we get a result 16 times faster than that of the
traditional algorithm for the model "delta". When the number of
bounding volumes is 794, the initialization time takes more time
than the tracing time in Figure 5.3. But this initialization
time is not a big portion of the total time, as shown in Figure
5.12. The initialization time depends not only on the number of
bounding volumes but also on the number of objects in the image
space.

In  comparing  the  pictures  produced  by  using  the  standard 
procedure  and  by  the  new  algorithm,  yet  another  test  must  be 
made.  We  must  compare  the  pictures  pixel  by  pixel  to  assess 
their  fidelity.  In  the  following  we  discuss  the  results  for 
this  comparison.  Two  image  files  are  generated  by  the 
traditional  ray  tracing  algorithm  and  the  new  algorithm.  The 
resolution  of  both  images  is  1024x768.  The  new  algorithm  uses 
42  bounding  volumes  to  produce  the  image  file. 

Each pixel of an image file is a 24-bit color specification. We
compare the two files pixel by pixel. The number of color pixels
is 176,703. Of these, 263 color pixels are different. To analyze
these pixels, we highlight them in white, as shown in Figure
5.16.
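
A pixel-by-pixel comparison of this kind can be sketched in C.
The buffer layout (three bytes per pixel in R, G, B order, both
images already loaded into memory) and the function names are
assumptions made for illustration; the file format used in the
experiment is not stated here.

```c
#include <stddef.h>

/* Counts pixels whose 24-bit RGB value differs between two images.
   Each buffer holds n_pixels * 3 bytes in R, G, B order (assumed layout). */
size_t count_different_pixels(const unsigned char *a,
                              const unsigned char *b,
                              size_t n_pixels)
{
    size_t different = 0;
    for (size_t i = 0; i < n_pixels; ++i) {
        const unsigned char *pa = a + 3 * i;
        const unsigned char *pb = b + 3 * i;
        if (pa[0] != pb[0] || pa[1] != pb[1] || pa[2] != pb[2])
            ++different;
    }
    return different;
}

/* Share of differing pixels as a percentage; 0 when the image is empty. */
double percent_of(size_t part, size_t whole)
{
    return whole == 0 ? 0.0 : 100.0 * (double)part / (double)whole;
}
```

A pixel counts as different as soon as one of its three channels
differs, so the result is a count of pixels rather than of bytes.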
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Aliasing  and  round  off  error  are  the  main  reasons  for  the 
difference. Let us consider the aliasing problem using Figure
5.15. Two planes A and B intersect at I, and the ray R hits the
intersection point I. If plane A is red and plane B is blue,
which color must be selected by the ray R? If the objects are
linked as in link list 1, the ray R selects the color of A. If
the objects are linked as in link list 2, the ray R returns the
color of B. If the traditional algorithm uses link list 1 and
the new algorithm uses link list 2, a pixel difference occurs at
the intersection point when we compare the two files. 144 pixels
of 176,703 pixels is 0.08% of the total color pixels. This
number is reasonable because the image files are produced by two
different algorithms. Generally, to overcome the aliasing
problem, we may use a super-sampling algorithm. Super-sampling
can also be applied to the new algorithm to reduce aliasing
problems.

The whole processing time of the original algorithm is 21
hours 24 minutes 17 seconds. The new algorithm with 545
bounding volumes takes 1 hour 32 minutes 21 seconds to produce
the image at 1024x768 resolution.
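
Converting the two reported times to seconds gives the speedup
for this image directly. The symbols T_orig and T_new are
introduced here only for the calculation:

```latex
\begin{aligned}
T_{\text{orig}} &= 21 \cdot 3600 + 24 \cdot 60 + 17 = 77057\ \text{s} \\
T_{\text{new}}  &= 1 \cdot 3600 + 32 \cdot 60 + 21 = 5541\ \text{s} \\
\frac{T_{\text{orig}}}{T_{\text{new}}} &= \frac{77057}{5541} \approx 13.9
\end{aligned}
```

At this resolution the new algorithm is therefore about 14 times
faster than the original algorithm in software alone.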


Resolutions  : 80  x 50 
Model  : Trees 
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Figure  5.3  Experimental  Results  for  Trees  (80x50) 
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Figure  5.4  Process  Time  for  Trees  (80x50) 
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Figure  5.5  Trace  Time  and  Number  of  Intersections 

for  Trees  (80x50) 


Resolutions  : 640  x 400 
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Figure  5.6  Experimental  Results  for  Trees  (640x400) 
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Figure  5.7  Process  Time  for  Trees  (640x400) 


Figure 5.8  Trace Time and Number of Intersections for Trees
(640x400)
(Plot of trace time and number of intersections against the
number of bounding volumes, 0 to 208; unit of Y axis: 10 million)


Resolutions  : 80  x 50 
Model  : Delta 
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Figure  5.9  Experimental  Results  for  Delta  (80x50) 
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Figure  5.10  Process  Time  for  Delta  (80x50) 
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Figure  5.11  Trace  Time  and  Number  of  Intersections 

for  Delta  (80x50) 


Resolutions  : 640  x 400 
Model  : Delta 
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Figure  5.12  Experimental  Results  for  Delta  (640x400) 
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Figure  5.13  Process  Time  for  Delta  (640x400) 
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Figure  5.14  Trace  Time  and  Number  of  Intersections 

for  Delta  (640x400) 
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Figure  5.15  Aliasing 
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Figure  5.16  Differences  between  two  outputs 


CHAPTER  6 
CONCLUSION 

6.1  Summary

The  objective  of  this  study  was  to  develop  a new  fast 
algorithm  for  ray  tracing  and  to  specify  hardware  which  will 
substantially  ease  computational  bottlenecks  in  the  ray 
tracing  procedure. 

The  algorithm  which  was  developed  employs  sphere  bounding 
volumes  to  eliminate  having  to  check  for  the  possible 
intersection  of  each  object  with  each  ray  that  characterizes 
the  pixels  of  the  final  image.  Traditional  bounding  volumes 
are  used  to  bound  objects.  The  sphere  bounding  volumes  of  the 
new  algorithm  are  used  to  bound  subspaces  which  could  have 
whole  objects  and/or  parts  of  an  object.  The  sphere  bounding 
volumes  are  sorted  with  respect  to  ray  direction  for  each  ray 
in  order  to  find  the  object  nearest  to  the  ray  origin. 
Traditional  sorting  of  sphere  bounding  volumes  needs  to 
calculate  square  roots.  To  avoid  square  root  calculations,  we 
developed  a comparison  algorithm  which  uses  coefficients  of 
quadratic  equations  for  sorting  bounding  volumes.  In  the 
traditional  algorithms  the  computational  bottleneck  is  caused 
by  the  high  number  of  ray-object  intersections.  In  the  new 
algorithm, the ray-object intersection tests start from the
nearest bounding volume. If a ray hits an object in the
bounding volume, the intersection test is terminated. If not,
the intersection test considers the next nearest bounding
volume.
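
The early termination described above can be sketched in C. To
keep the example self-contained, each volume carries precomputed
hit distances that stand in for real ray-object intersection
tests; that data layout and the names are hypothetical. The
volumes are assumed to be already sorted from nearest to
farthest.

```c
#include <stddef.h>

/* One depth-sorted bounding volume. hit_t[i] stands in for the result of a
   real ray-object test on object i: the hit distance, or a negative value
   for a miss. This layout is hypothetical. */
typedef struct {
    const double *hit_t;
    size_t        n_objects;
} Volume;

/* Walks the volumes from nearest to farthest and stops at the first volume
   in which some object is hit. Returns the nearest hit distance inside that
   volume, or -1.0 if nothing is hit. *tests counts ray-object tests. */
double trace_sorted(const Volume *volumes, size_t n_volumes, size_t *tests)
{
    *tests = 0;
    for (size_t v = 0; v < n_volumes; ++v) {
        double nearest = -1.0;
        for (size_t i = 0; i < volumes[v].n_objects; ++i) {
            double t = volumes[v].hit_t[i];
            ++*tests;
            if (t >= 0.0 && (nearest < 0.0 || t < nearest))
                nearest = t;
        }
        if (nearest >= 0.0)
            return nearest;   /* terminate: farther volumes are skipped */
    }
    return -1.0;
}
```

The test counter makes the saving visible: objects in volumes
behind the first volume that produces a hit are never tested.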

This algorithm gives us a smaller number of ray-object
intersections when we employ more bounding volumes. Even
though the number of intersections is decreased, the
computational requirements for sorting the sphere bounding
volumes are increased. To eliminate this extra load we use a
depth sorter, which can potentially be implemented in simple
hardware.

The basic ideas of the new algorithm are the following:

1.  Sorting of the bounding volumes is done by the depth sorter;

2.  Ray-object intersection is done by software.

The performance of the new algorithm was verified by extensive
computer simulations. When we employ many bounding volumes, the
number of ray-object intersections is decreased at the expense
of additional sphere intersections. Since the new procedure
greatly simplifies the sphere sorting process, there is a
substantial saving in overall computation time, amounting to a
six-fold decrease in many of the test cases. To verify the new
algorithm we compared the outputs produced by the two algorithms
(the traditional ray tracing algorithm and the new algorithm).

6.2  Remarks  on  the  New  Algorithm 

The new algorithm was simulated on an RS 6000 workstation, with
the implementation written in the C programming language. Even
without a hardware implementation, the new algorithm shows a
noticeable reduction in computational requirements. If the model
includes light sources or refraction, the new algorithm shows an
even greater computational improvement. Although designed for
hardware implementation, the new algorithm can also be used in
software to speed up the general ray tracing process.

The new algorithm need not build bounding volume hierarchies in
the initialization stage, because hierarchies with respect to
ray direction are built by the sorter for each ray. Glassner's
algorithm [Gla 84] cannot efficiently utilize memory for
subspace divisions. Fujimoto's algorithm needs a very large
memory because it needs memory assignments even for empty
subspaces. Since a subspace which contains no object or part of
an object needs no bounding volume, the new algorithm needs much
less memory than the spatial subdivision algorithm requires. The
reduced memory requirements are advantages of the hierarchical
bounding volume algorithm and of the spatial subdivision
algorithm. The new algorithm may be applied to any kind of
object model.

Even though the hardware discussed is for bounding volume
sorting, any kind of primitive can be included in the new
algorithm. Because the sort is performed not on the objects but
on the bounding volumes, we can easily extend the ray tracing
algorithm.

The new algorithm shows substantial speed improvements (a
six-fold reduction in computation time in many of the test
cases), needs a relatively small, fixed memory space for the
bounding volumes, and is expandable whether or not it is
implemented in hardware.


6.3  Recommendations  for  Future  Research 

Many follow-up studies are possible. In particular, studies of
the automatic assignment of the proper number of bounding
volumes, and of more efficient bounding algorithms, may yield
further improvements. Several aspects of various bounding
algorithms are considered in Chapter 4. To gain more
computational efficiency we need a robust sphere bounding
algorithm based on the following considerations:

1.  reduction  of  void  areas; 

2.  reduction  of  overlap  areas. 

To solve these problems, we must understand how many bounding
volumes are optimum for a given model and, when the memory for
bounding volumes is fixed, how to subdivide the image space for
efficient use of the available memory. To devise a
computationally efficient procedure for these tasks, one needs
yet another bounding algorithm.
APPENDIX 


Let $t_1$, $t_2$ be the smallest positive roots of $f_1(t)$,
$f_2(t)$, respectively, as shown in Fig. 4.8:

$$t_1 = \frac{-b_1 - D_1^{1/2}}{2}$$

$$t_2 = \frac{-b_2 - D_2^{1/2}}{2}$$

where $D_i = b_i^2 - 4c_i$.

We wish to find the smaller of $t_1$ and $t_2$ without
calculating either root exactly. This may be done by comparing
only the coefficients $b$, $c$, which is sufficient to sort the
bounding volumes by depth along the ray path.
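
The coefficients $b$ and $c$ compared here are not restated in this appendix. The following C sketch shows one common way to obtain them, assuming the usual ray-sphere setup with a unit-length ray direction, so that the quadratic is monic ($t^2 + bt + c = 0$), which is consistent with the root formula above. The names `Vec3`, `Sphere`, `Ray`, `dot3` and `sphere_coeffs` are illustrative and are not taken from the simulation code.

```c
/* Illustrative types; not taken from the thesis implementation. */
typedef struct { double x, y, z; } Vec3;
typedef struct { Vec3 center; double radius; } Sphere;
typedef struct { Vec3 origin; Vec3 dir; } Ray;  /* dir assumed unit length */

static double dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Coefficients of t^2 + b*t + c = 0 for the ray origin + t*dir
   against the sphere |p - center|^2 = radius^2. */
void sphere_coeffs(Ray ray, Sphere s, double *b, double *c)
{
    Vec3 oc = { ray.origin.x - s.center.x,
                ray.origin.y - s.center.y,
                ray.origin.z - s.center.z };

    *b = 2.0 * dot3(ray.dir, oc);
    *c = dot3(oc, oc) - s.radius * s.radius;
}
```

For a ray from the origin along the z axis and a unit sphere centred at (0, 0, 5), this gives b = -10 and c = 24, whose quadratic has the roots 4 and 6, the entry and exit distances.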

Assume that $t_1 > t_2$:

$$(-b_1 - D_1^{1/2}) > (-b_2 - D_2^{1/2}) \iff b_2 - b_1 > D_1^{1/2} - D_2^{1/2}$$

i) $b_2 - b_1 > 0$, $D_1^{1/2} - D_2^{1/2} < 0$:

select $t_2$

ii) $b_2 - b_1 > 0$,

$$(D_1^{1/2} - D_2^{1/2} \ge 0 \iff D_1^{1/2} \ge D_2^{1/2} \iff b_1^2 - 4c_1 \ge b_2^2 - 4c_2)$$

$$(b_2 - b_1)^2 > (D_1^{1/2} - D_2^{1/2})^2 \iff b_2^2 - 2b_1b_2 + b_1^2 > D_1 + D_2 - 2D_1^{1/2}D_2^{1/2} \iff 2D_1^{1/2}D_2^{1/2} > 2(b_1b_2 - 2c_1 - 2c_2)$$

if $(b_1b_2 - 2c_1 - 2c_2) < 0$, then select $t_2$

if $(b_1b_2 - 2c_1 - 2c_2) \ge 0$ and

$$(b_1^2 - 4c_1)(b_2^2 - 4c_2) > (b_1b_2 - 2c_1 - 2c_2)^2 \iff (b_1c_2 - b_2c_1)(b_2 - b_1) > (c_1 - c_2)^2$$

then select $t_2$    (a)

if $(b_1b_2 - 2c_1 - 2c_2) \ge 0$ and

$$(b_1^2 - 4c_1)(b_2^2 - 4c_2) = (b_1b_2 - 2c_1 - 2c_2)^2 \iff (b_1c_2 - b_2c_1)(b_2 - b_1) = (c_1 - c_2)^2$$

then select $t_1$ ($\because t_1 = t_2$)    (b)

From (a), (b):

$$(b_1c_2 - b_2c_1)(b_2 - b_1) \ge (c_1 - c_2)^2$$

then select $t_2$

From conditions (a) and (b) we can construct the state variable
$X_4 = 0$.

if $(b_1b_2 - 2c_1 - 2c_2) \ge 0$ and

$$(b_1^2 - 4c_1)(b_2^2 - 4c_2) < (b_1b_2 - 2c_1 - 2c_2)^2 \iff (b_1c_2 - b_2c_1)(b_2 - b_1) < (c_1 - c_2)^2$$

then select $t_1$
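
The equivalence used in (a) and (b), between the comparison of the discriminant product and the comparison involving $(c_1 - c_2)^2$, follows by expanding both sides. As a short check, writing $K = b_1b_2 - 2c_1 - 2c_2$ (a shorthand introduced here for convenience, not a symbol of the original derivation):

```latex
\begin{aligned}
D_1 D_2 - K^2
  &= (b_1^2 - 4c_1)(b_2^2 - 4c_2) - (b_1 b_2 - 2c_1 - 2c_2)^2 \\
  &= -4 b_1^2 c_2 - 4 b_2^2 c_1 + 16 c_1 c_2
     + 4 b_1 b_2 (c_1 + c_2) - 4 (c_1 + c_2)^2 \\
  &= 4 \left[ (b_1 c_2 - b_2 c_1)(b_2 - b_1) - (c_1 - c_2)^2 \right]
\end{aligned}
```

Hence $D_1 D_2 - K^2$ and $(b_1c_2 - b_2c_1)(b_2 - b_1) - (c_1 - c_2)^2$ always have the same sign, so the second, cheaper expression can replace the first in every test.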
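iii) $b_2 - b_1 < 0$,

$$(D_1^{1/2} - D_2^{1/2} < 0 \iff b_1^2 - 4c_1 < b_2^2 - 4c_2)$$

$$(b_2 - b_1)^2 < (D_1^{1/2} - D_2^{1/2})^2 \iff b_2^2 - 2b_1b_2 + b_1^2 < D_1 + D_2 - 2D_1^{1/2}D_2^{1/2} \iff 2D_1^{1/2}D_2^{1/2} < 2(b_1b_2 - 2c_1 - 2c_2)$$

if $(b_1b_2 - 2c_1 - 2c_2) < 0$, then select $t_1$

if $(b_1b_2 - 2c_1 - 2c_2) \ge 0$ and

$$(b_1^2 - 4c_1)(b_2^2 - 4c_2) < (b_1b_2 - 2c_1 - 2c_2)^2 \iff (b_1c_2 - b_2c_1)(b_2 - b_1) < (c_1 - c_2)^2$$

then select $t_2$

if $(b_1b_2 - 2c_1 - 2c_2) \ge 0$ and

$$(b_1^2 - 4c_1)(b_2^2 - 4c_2) > (b_1b_2 - 2c_1 - 2c_2)^2 \iff (b_1c_2 - b_2c_1)(b_2 - b_1) > (c_1 - c_2)^2$$

then select $t_1$    (c)

if $(b_1b_2 - 2c_1 - 2c_2) \ge 0$ and

$$(b_1^2 - 4c_1)(b_2^2 - 4c_2) = (b_1b_2 - 2c_1 - 2c_2)^2 \iff (b_1c_2 - b_2c_1)(b_2 - b_1) = (c_1 - c_2)^2$$

then select $t_1$ ($\because t_1 = t_2$)    (d)

From (c), (d):

$$(b_1c_2 - b_2c_1)(b_2 - b_1) \ge (c_1 - c_2)^2$$

then select $t_1$

From conditions (c) and (d) we can construct the state variable
$X_4 = 0$.

iv) $b_2 - b_1 < 0$, $D_1^{1/2} - D_2^{1/2} \ge 0$:

the assumption is wrong (a negative quantity cannot exceed a
non-negative one); select $t_1$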
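
Cases i) to iv) can be collected into a single comparison that uses no square root. The following C sketch is one possible reading of the case analysis, not the hardware depth sorter itself; the function name `nearer_sphere` and the local names `p`, `k` and `e` are hypothetical. It assumes that both discriminants are non-negative (the ray meets both spheres), and it folds the boundary $b_2 - b_1 = 0$, which the cases above do not list, into the $b_2 - b_1 \ge 0$ branch. When $t_1 = t_2$ either answer is valid; the sketch returns 1.

```c
/* Returns 1 if t1 <= t2 and 2 if t2 < t1, where
   t_i = (-b_i - sqrt(b_i^2 - 4 c_i)) / 2, using only
   additions, multiplications and sign tests. */
int nearer_sphere(double b1, double c1, double b2, double c2)
{
    double p  = b2 - b1;
    double d1 = b1 * b1 - 4.0 * c1;          /* discriminant D1 */
    double d2 = b2 * b2 - 4.0 * c2;          /* discriminant D2 */
    double k  = b1 * b2 - 2.0 * c1 - 2.0 * c2;
    /* e has the same sign as D1*D2 - k*k */
    double e  = (b1 * c2 - b2 * c1) * (b2 - b1)
              - (c1 - c2) * (c1 - c2);

    if (p >= 0.0) {
        if (d1 < d2) return 2;               /* case i)              */
        if (k < 0.0) return 2;               /* case ii), k < 0      */
        return (e > 0.0) ? 2 : 1;            /* case ii), k >= 0     */
    }
    if (d1 >= d2) return 1;                  /* case iv)             */
    if (k < 0.0) return 1;                   /* case iii), k < 0     */
    return (e < 0.0) ? 2 : 1;                /* case iii), k >= 0    */
}
```

For example, spheres whose quadratics have roots (2, 4) and (3, 10), that is (b, c) = (-6, 8) and (-13, 30), are ordered with the first sphere nearer. In floating-point arithmetic the products in `k` and `e` can lose precision for nearly equal depths, so a practical version might need a tolerance.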